Latest Developments of OpenMP 4.0 & 4.5 including OpenMP Offload Model
The goal of this tutorial is to help researchers and application developers get the most out of their highly concurrent, heterogeneous computers using the latest advances in the OpenMP standard. We will put special emphasis on how to harness accelerators and offloading engines such as GPUs using OpenMP constructs. We will demonstrate the wide range of offloading constructs, from naïve constructs that need no more than one pragma to offload a computation to a GPU, to more sophisticated patterns that put the user in control of data migration and so lessen the cost of communication between a host node and its offloading engine. Other topics include exploiting SIMD-level parallelism, managing high thread concurrency in deep computer hierarchies where locality of data and computation matters, constructs that better support searching through big data, and more. We will provide hands-on examples and conclude with a discussion of future areas of interest to the OpenMP standard.
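To give a flavor of the two ends of that range, here is a minimal sketch in C. It is not taken from the tutorial slides: the function names saxpy_offload and saxpy_twice are hypothetical, and the code assumes an OpenMP 4.0/4.5 compiler with offloading support (without one, the pragmas are ignored or the loops fall back to the host, and the results are the same). The first function offloads a loop with a single combined directive; the second wraps two kernels in a target data region so the arrays are transferred only once.

```c
#include <assert.h>
#include <stddef.h>

/* Naive offload: one combined directive sends the loop to the default
 * device. The map clauses copy x in, and copy y in and back out. */
void saxpy_offload(int n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);

    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* User-controlled data migration: the target data region maps x and y
 * once, and both kernels inside it reuse the device copies. */
void saxpy_twice(int n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);

    #pragma omp target data map(to: x[0:n]) map(tofrom: y[0:n])
    {
        /* The arrays are already present on the device, so these map
         * clauses find the existing copies and trigger no new transfer. */
        #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: y[0:n])
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];

        #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: y[0:n])
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    } /* y is copied back to the host here, once. */
}
```

Calling saxpy_offload twice in a row would compute the same values as saxpy_twice but move both arrays across the host-device link on every call, which is the communication cost the data region is meant to avoid.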
Alexandre E. Eichenberger (IBM & OpenMP Accelerator Subcommittee member)
Eric Stotzer (TI & OpenMP Accelerator Subcommittee Co-Chair)
Tutorial Slides: PPOPP17-openmp-devices.pdf (9.69 MiB)