Formalizing Bottlenecks in Task-Based OpenMP Applications

Task support was introduced into OpenMP to address irregular parallelism on shared-memory architectures. Creating extremely fine-grained tasks, however, impedes performance. This paper presents a methodology for analyzing the performance of task-based OpenMP programs and its implementation in Periscope. It focuses on newly formulated high-level performance properties that formalize typical performance bottlenecks of task-based programs. The paper also reports experimental results obtained for the codes of the Barcelona OpenMP Tasks Suite (BOTS) using Periscope on the SuperMUC supercomputer at Garching, Germany.
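The granularity problem described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names and the `cutoff` parameter are hypothetical, and Fibonacci is used only as a familiar recursive workload. Below the cutoff the recursion runs serially, so no tasks are created for very small subproblems.

```c
#include <assert.h>

/* Serial fallback: used for small inputs so that no task is created. */
static long fib_serial(int n) {
    return n < 2 ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

/* Task-based recursion. `cutoff` is a hypothetical tuning parameter:
 * subproblems with n <= cutoff are solved without spawning tasks. */
static long fib_task(int n, int cutoff) {
    long a, b;
    if (n < 2) return n;
    if (n <= cutoff) return fib_serial(n);

    #pragma omp task shared(a) firstprivate(n, cutoff)
    a = fib_task(n - 1, cutoff);
    #pragma omp task shared(b) firstprivate(n, cutoff)
    b = fib_task(n - 2, cutoff);
    #pragma omp taskwait  /* wait for both child tasks before combining */

    return a + b;
}

/* Entry point: one thread creates the root of the task tree,
 * the rest of the team executes the generated tasks. */
long fib_parallel(int n, int cutoff) {
    long result = 0;
    assert(n >= 0);
    #pragma omp parallel
    #pragma omp single
    result = fib_task(n, cutoff);
    return result;
}
```

Calling `fib_parallel(n, 0)` creates a task for almost every recursive call, which is the extremely fine-grained case. Raising the cutoff trades available parallelism for lower task-creation overhead. The code also compiles without OpenMP support, in which case the pragmas are ignored and it runs serially.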

Related Results

High-level compiler analysis for OpenMP
Nowadays, applications from dissimilar domains, such as high-performance computing and high-integrity systems, require levels of performance that can only be achieved by means of s...
Towards a safe and efficient OpenMP
The growing complexity of contemporary multi-core and heterogeneous architectures necessitates parallel programming models capable of efficiently leveraging the available...
Automatic Parallelization for Heterogeneous Embedded Systems
The use of heterogeneous architectures, combining multicore processors with accelerat...
Scheduler guided OpenMP execution in cloud VMs
OpenMP is a widely used framework for parallelizing applications, enabling ...
Efficient Parallel Linked List Processing
OpenMP is a very popular and successful parallel programming API, but efficient parallel traversal of a list (of possibly unknown size) of items linked by pointers is a challenging...
Maximize Asset Utilization by Effectively Identifying and Removing Bottlenecks
ADNOC Gas Processing Ruwais NGL Fractionation plant receives and fractionates the NGL produced in upstream gas processing plants. After operation of newly d...
Performance evaluation of NEMO4.2 with Paraver
The latest release of the NEMO v4.2 ocean model includes many modifications that have a significant impact on the model performance. The goal of the work is to assess NEMO performanc...
Towards a Performance Engineering Workflow for OpenMP 4.0
Parallel programming and performance optimization of parallel programs are not simple tasks. Various HPC and OpenMP courses as well as literature serve as introduction to this topi...