
Parallelizing and optimizing large‐scale 3D multi‐phase flow simulations on the Tianhe‐2 supercomputer

Summary: The lattice Boltzmann method (LBM) is a widely used computational fluid dynamics method for flow problems with complex geometries and various boundary conditions. Large‐scale LBM simulations with increasing resolution and longer temporal ranges require massive high‐performance computing (HPC) resources, which motivates porting LBM codes onto modern many‐core heterogeneous supercomputers such as Tianhe‐2. Although many‐core accelerators such as graphics processing units (GPUs) and Intel MIC have a dramatic advantage over CPUs in floating‐point performance and power efficiency, they also make it challenging to parallelize and optimize computational fluid dynamics codes on large‐scale heterogeneous systems.

In this paper, we parallelize and optimize the open source 3D multi‐phase LBM code openlbmflow on the Intel Xeon Phi (MIC) accelerated Tianhe‐2 supercomputer using a hybrid and heterogeneous MPI+OpenMP+Offload+single instruction, multiple data (SIMD) programming model. With cache blocking and a SIMD‐friendly data structure transformation, we dramatically improve the SIMD and cache efficiency of the single‐thread code on both CPU and Phi, achieving speedups of 7.9X and 8.8X, respectively, compared with the baseline code. To make the CPUs and Phi processors cooperate efficiently, we propose a load‐balance scheme that distributes workloads among the two CPUs and three Phi processors within each node, and we use an asynchronous model to overlap the collaborative computation and communication as far as possible. The collaborative approach with two CPUs and three Phi processors improves performance by around 3.2X compared with the CPU‐only approach. Scalability tests show that openlbmflow can achieve a parallel efficiency of about 60% on 2048 nodes, with about 400K cores in total. To the best of our knowledge, this is the largest scale CPU‐MIC collaborative LBM simulation for 3D multi‐phase flow problems. Copyright © 2015 John Wiley & Sons, Ltd.
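The summary does not spell out how the load‐balance scheme divides work, so the following is only a minimal sketch of one plausible approach: splitting the lattice planes of a node's subdomain among its two CPUs and three Phi processors in proportion to each device's measured throughput. The function name `split_planes`, the plane count of 256, and the relative speeds are all hypothetical values chosen for illustration; none of them come from the paper.

```python
def split_planes(total_planes, speeds):
    """Split `total_planes` lattice planes among devices in proportion to `speeds`.

    Uses largest-remainder rounding so that every share is an integer and
    the shares add up exactly to `total_planes`.
    """
    if total_planes < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("need a non-negative plane count and positive speeds")

    total_speed = sum(speeds)
    # Ideal (fractional) share of each device.
    ideal = [total_planes * s / total_speed for s in speeds]
    # Start from the rounded-down shares.
    shares = [int(x) for x in ideal]
    leftover = total_planes - sum(shares)

    # Give the remaining planes to the devices with the largest fractional parts.
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - shares[i],
                   reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


# Hypothetical relative throughputs: two CPUs (1.0 each), three Phi cards (1.5 each).
speeds = [1.0, 1.0, 1.5, 1.5, 1.5]
shares = split_planes(256, speeds)
print(shares)
```

In practice the relative speeds would be measured (for example, from the time each device needs per lattice update in a short warm-up run) rather than fixed, and the split would be recomputed if the measurements drift.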

Related Results

Large-Scale Heterogeneous Computing for 3D Deterministic Particle Transport on Tianhe-2A Supercomputer
Scalable parallel algorithm for particle transport is one of the main application fields in high-performance computing. Discrete ordinate method (Sn) is one of the most popular det...
Multiphase Flow Metering: An Evaluation of Discharge Coefficients
Abstract: The orifice discharge coefficient (CD) is the constant required to correct theoretical flow rate to actual flow rate. It is known that single phase orifi...
Deploying and scaling distributed parallel deep neural networks on the Tianhe-3 prototype system
Deploying and scaling distributed parallel deep neural networks on the Tianhe-3 prototype system
Abstract: Due to the increase in computing power, it is possible to improve the feature extraction and data fitting capabilities of DNN networks by increasing their depth and model c...
A New Mathematical Model for EOR Displacements Honouring Oil Ganglia
A New Mathematical Model for EOR Displacements Honouring Oil Ganglia
Abstract: During two-phase flow in porous media non-wetting phase is present simultaneously in states of mobile connected continuum and of trapped isolated ganglia...
On Robust and Efficient Parallel Reservoir Simulation on Tianhe-2
On Robust and Efficient Parallel Reservoir Simulation on Tianhe-2
Abstract: Parallel reservoir simulators are now widely used with availability of super computers. Modern massively parallel supercomputers demonstrate great power for...
An Architecture-Aware Heterogeneous Multigrid Solver for Geodynamic Simulations on the New-Generation Tianhe Supercomputer
Large-scale mantle convection simulations repeatedly solve sparse velocity-pressure systems, and the multigrid velocity solver often dominates the total runtime. This paper present...
Kaji efisiensi temperatur penukar panas dengan variasi aliran untuk aplikasi pengering
Abstract: A heat exchanger is a device used to change the temperature of a fluid or to change the phase of a fluid by exchanging its heat with...
