A CUDA fast multipole method with highly efficient M2L far field evaluation
Solving an N-body problem, electrostatic or gravitational, is a crucial task and the main computational bottleneck in many scientific applications. Its direct solution is a ubiquitous showcase example for the compute power of graphics processing units (GPUs). However, the naïve pairwise summation has $\mathcal{O}(N^2)$ computational complexity. The fast multipole method (FMM) can reduce runtime and complexity to $\mathcal{O}(N)$ for any specified precision. Here, we present a CUDA-accelerated C++ FMM implementation for multi-particle systems with $1/r$ potential that are found, e.g., in biomolecular simulations. The algorithm involves several operators to exchange information in an octree data structure. We focus on the Multipole-to-Local (M2L) operator, as its runtime limits the overall performance. We propose, implement and benchmark three different M2L parallelization approaches. Approach (1) utilizes Unified Memory to minimize programming and porting efforts. It achieves decent speedups with little implementation work. Approach (2) employs CUDA Dynamic Parallelism to significantly improve performance for high approximation accuracies. The presorted list-based approach (3) fits periodic boundary conditions particularly well. It exploits FMM operator symmetries to minimize both memory accesses and the number of complex multiplications. The result is a compute-bound implementation, i.e., performance is limited by arithmetic operations rather than by memory accesses. The complete CUDA-parallelized FMM is incorporated within the GROMACS molecular dynamics package as an alternative Coulomb solver.
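To make the quadratic cost of the naïve pairwise summation concrete, the following is a minimal CPU sketch in C++ of a direct evaluation of the 1/r potential. It is not taken from the paper's implementation: the `Particle` struct, the function name `direct_potential`, and the choice of units (Coulomb prefactor set to 1, open boundaries) are assumptions made for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical particle type for illustration: position and charge.
struct Particle {
    double x, y, z;
    double q;
};

// Direct pairwise evaluation of the 1/r potential at every particle position.
// Assumptions: Coulomb prefactor is 1, no periodic images, no two particles
// share the same position. Each unordered pair is visited exactly once, so
// the loop performs N*(N-1)/2 distance evaluations, i.e. quadratic work in N.
std::vector<double> direct_potential(const std::vector<Particle>& p) {
    const std::size_t n = p.size();
    std::vector<double> phi(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            const double dx = p[i].x - p[j].x;
            const double dy = p[i].y - p[j].y;
            const double dz = p[i].z - p[j].z;
            const double inv_r = 1.0 / std::sqrt(dx * dx + dy * dy + dz * dz);
            phi[i] += p[j].q * inv_r;  // contribution of particle j at i
            phi[j] += p[i].q * inv_r;  // contribution of particle i at j
        }
    }
    return phi;
}
```

Doubling the particle count roughly quadruples the number of loop iterations here, which is the scaling the FMM is designed to avoid by replacing far-field pair interactions with multipole and local expansions on the octree.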

