Formalizing Bottlenecks in Task-Based OpenMP Applications

Task support was introduced into OpenMP to address irregular parallelism on shared-memory architectures. Creating extremely fine-grained tasks, however, impedes performance. This paper presents a methodology for analyzing the performance of task-based OpenMP programs and its implementation in Periscope. It focuses on newly formulated high-level performance properties that formalize typical performance bottlenecks of task-based programs. In addition, the paper reports experimental results obtained for the codes of the Barcelona OpenMP Tasks Suite (BOTS) using Periscope on the SuperMUC supercomputer at Garching, Germany.

IOS Press

Shajulin Benedict, Michael Gerndt, Diana-Mihaela Gudu

Advances in Parallel Computing

2025

Related Results

High-level compiler analysis for OpenMP

Nowadays, applications from dissimilar domains, such as high-performance computing and high-integrity systems, require levels of performance that can only be achieved by means of s...

Towards a safe and efficient OpenMP

The growing complexity of contemporary multi-core and heterogeneous architectures necessitates parallel programming models capable of efficiently leveraging the available...

Automatic Parallelization for Heterogeneous Embedded Systems

Automatic parallelization for heterogeneous embedded systems. The use of heterogeneous architectures, combining multicore processors with accelerat...

Scheduler guided OpenMP execution in cloud VMs

Scheduler-guided OpenMP execution in cloud virtual machines. OpenMP is a widely used framework for parallelizing applications, allowing ...

Efficient Parallel Linked List Processing

OpenMP is a very popular and successful parallel programming API, but efficient parallel traversal of a list (of possibly unknown size) of items linked by pointers is a challenging...

Maximize Asset Utilization by Effectively Identifying and Removing Bottlenecks

ADNOC Gas Processing Ruwais NGL Fractionation plant receives and fractionates the NGL produced in upstream gas processing plants. After operation of newly d...

Performance evaluation of NEMO4.2 with Paraver

The last release of the NEMO v4.2 ocean model includes many modifications that have a significant impact on the model performance. The goal of the work is to assess NEMO performanc...

Towards a Performance Engineering Workflow for OpenMP 4.0

Parallel programming and performance optimization of parallel programs are not simple tasks. Various HPC and OpenMP courses as well as literature serve as introduction to this topi...