
Comparison of OpenMP Parallelization Strategies for a Vectorized Riemann Solver on the Intel Xeon Phi KNL Microprocessor

Riemann solvers are widely used in numerical methods for gas dynamics. A simulation must solve the Riemann problem of the decay of an arbitrary discontinuity for every pair of neighboring cells of the computational grid at every iteration, so an efficient implementation of the solver is essential. This article considers how to parallelize, with OpenMP, the application of a vectorized exact Riemann solver to arrays of input data (the gas-dynamic parameters to the left and right of each discontinuity). Three parallelization strategies are compared; all three were implemented and tested on 72-core Intel Xeon Phi KNL microprocessors. In the numerical experiments, the most efficient strategy gave a combined speedup of 368 times for the vectorized Riemann solver (using 139 threads) on a single Intel Xeon Phi KNL microprocessor, relative to the single-threaded, non-vectorized version of the solver.
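The abstract does not name the three strategies, so the following is only a minimal sketch of the general pattern it describes: an OpenMP loop distributes chunks of the input arrays across threads, and each thread calls a vectorized kernel on its chunk. Everything here is an assumption made for illustration. The names `riemann_kernel`, `solve_all` and `VEC_WIDTH` are hypothetical, single precision is assumed, and the kernel body is a placeholder (a plain average of the left and right pressures), not an exact Riemann solver and not the authors' code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: single precision, so 16 values fill one 512-bit AVX-512
   register on KNL. */
enum { VEC_WIDTH = 16 };

/* Hypothetical stand-in for the vectorized solver kernel. A real exact
   Riemann solver iterates on the pressure between the waves; this
   placeholder only averages the left and right pressures so that the
   sketch stays short. */
static void riemann_kernel(const float *pl, const float *pr,
                           float *p_out, size_t count)
{
    #pragma omp simd
    for (size_t i = 0; i < count; ++i)
        p_out[i] = 0.5f * (pl[i] + pr[i]);
}

/* Split the arrays into chunks whose length is a multiple of the vector
   width and hand the chunks to OpenMP threads with a static schedule.
   Without OpenMP support the pragmas are ignored and the loop runs
   serially with the same result. */
void solve_all(const float *pl, const float *pr, float *p_out,
               size_t n, size_t chunk)
{
    assert(chunk > 0 && chunk % VEC_WIDTH == 0);
    long n_chunks = (long)((n + chunk - 1) / chunk);

    #pragma omp parallel for schedule(static)
    for (long c = 0; c < n_chunks; ++c) {
        size_t begin = (size_t)c * chunk;
        /* The last chunk may be shorter than the others. */
        size_t count = (n - begin < chunk) ? n - begin : chunk;
        riemann_kernel(pl + begin, pr + begin, p_out + begin, count);
    }
}
```

Calling `solve_all(pl, pr, out, n, 16)` processes `n` cell interfaces in chunks of one vector register each. Larger chunks reduce scheduling overhead, smaller ones balance the load more evenly; which trade-off wins on KNL is exactly the kind of question the article's comparison addresses.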

Federal Scientific Center Scientific Research Institute for Systems Research of the Russian Academy of Sciences

Воробьев М.Ю. Рыбаков А.А. Чопорняк А.Д.

Труды НИИСИ РАН

2024



