C++ SIMD parallelism with Intel Cilk Plus and OpenMP 4.0

Performance is one of the most important aspects that comes to mind when choosing a programming language. Exploiting the performance of modern processors is not as straightforward as it was decades ago. Modern processors only rarely improve the serial execution of applications by increasing their frequency or adding more execution units. Nowadays, efficiency concerns and physical limitations dominate processor design. Performance improvements of such processors are now driven by one key paradigm: parallelism.

Parallelism is primarily associated with multi-threading and related models such as tasks and the actor model. Over the years this has been made available by various libraries (Boost, Intel Threading Building Blocks, etc.) and is now even an integral part of C++11. These allow better utilization of the large number of cores provided by modern multi-core and many-core processors. Another, less well-known aspect of parallelism is hidden inside each of these cores: SIMD (Single Instruction Multiple Data) execution. It is characterized by operating on multiple data elements with a single instruction instead of one instruction per element. This can increase performance by an order of magnitude, depending on the underlying data types and algorithms. Combining it with multi-threading can improve performance further by using SIMD execution on all cores.

However, compared to multi-threading, utilizing this kind of parallelism requires more effort. Essentially, it is subject to two limitations: dependence on the processor architecture and limited control from within a high-level language. Intel addresses these limitations by providing techniques as part of Intel Cilk Plus and the latest OpenMP 4.0 standard. The talk demonstrates the advantages of those techniques using different C++ examples. Furthermore, their application is described in more detail, including the latest status of integration into different compilers (Intel C++ Compiler, GNU GCC, and LLVM).
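As a rough illustration of the kind of high-level SIMD control the abstract refers to, the following minimal sketch uses the OpenMP 4.0 `simd` construct. It is not taken from the talk: the function names `saxpy_simd` and `sum_simd` are hypothetical, and the sketch assumes a compiler with OpenMP SIMD support enabled (for example via `-fopenmp-simd` on GCC or Clang). Without that support the pragmas are ignored and the loops run serially with the same results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the talk): computes y[i] = a * x[i] + y[i].
// The `simd` construct asks the compiler to execute several loop
// iterations at once using the SIMD instructions of the target processor.
void saxpy_simd(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const float* xp = x.data();
    float* yp = y.data();
    const std::size_t n = x.size();

    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}

// Hypothetical helper (not from the talk): sums all elements of v.
// The `reduction` clause tells the compiler that `total` is accumulated
// across iterations, so partial sums may be kept in separate SIMD lanes
// and combined after the loop.
float sum_simd(const std::vector<float>& v) {
    const float* p = v.data();
    const std::size_t n = v.size();
    float total = 0.0f;

    #pragma omp simd reduction(+:total)
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];
    }
    return total;
}
```

The same source compiles for different instruction sets, which is how such a directive can ease the architecture dependence mentioned above: the vector width is chosen by the compiler for the target rather than written into the code.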

Speaker: Georg Zitzlsberger

Slides: C++ SIMD parallelism with Intel Cilk Plus and OpenMP 4.0

