EGA: An Efficient GPU Accelerated Groupby Aggregation Algorithm
With the exponential growth of big data, efficient groupby aggregation (GA) has become critical for real-time analytics across industries. GA is a key method for extracting valuable information. Current CPU-based solutions (such as large-scale parallel processing platforms) face computational throughput limitations and struggle to support real-time big data analysis, so GPUs have been introduced to support real-time GA. Most GPU GA algorithms are hash-based, and their performance degrades when the load factor of the hash table is too high or when the data volume exceeds GPU memory capacity. This paper proposes an efficient hash-based GPU-accelerated groupby aggregation algorithm (EGA) that addresses these limitations. EGA uses a different design for each scenario: single-pass EGA (SP-EGA) maintains high efficiency when the data fit in GPU memory, while multipass EGA (MP-EGA) supports GA on data exceeding GPU memory capacity. EGA demonstrates significant acceleration: SP-EGA outperforms state-of-the-art (SOTA) hash-based GPU algorithms by 1.16–5.39× at load factors >0.90 and surpasses SOTA sort-based GPU methods by 1.30–2.48×. MP-EGA achieves a 6.45–29.12× speedup over SOTA CPU implementations.
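To make the terms in the abstract concrete, the following is a minimal CPU-side sketch of generic hash-based groupby aggregation. It is not the EGA algorithm, whose design the abstract does not describe. The function names (`hash_groupby_sum`, `multipass_groupby_sum`), the linear-probing scheme, and the hash-partitioning strategy used for the multipass case are all assumptions chosen for illustration; a real GPU implementation would run insertions in parallel with atomic operations.

```python
EMPTY = object()  # sentinel marking an unused hash-table slot


def hash_groupby_sum(keys, values, capacity):
    """Single-pass groupby SUM with a fixed-size open-addressing table.

    Returns (result_dict, probes). The probe count grows as the load
    factor (distinct keys / capacity) approaches 1, which is the kind of
    slowdown the abstract attributes to hash-based methods.
    """
    table_keys = [EMPTY] * capacity
    table_sums = [0] * capacity
    probes = 0
    for k, v in zip(keys, values):
        start = hash(k) % capacity
        for step in range(capacity):  # linear probing, at most one full sweep
            i = (start + step) % capacity
            probes += 1
            if table_keys[i] is EMPTY:  # first time this key is seen
                table_keys[i] = k
                table_sums[i] = v
                break
            if table_keys[i] == k:  # existing group: accumulate
                table_sums[i] += v
                break
        else:
            raise OverflowError("more groups than hash-table slots")
    result = {k: s for k, s in zip(table_keys, table_sums) if k is not EMPTY}
    return result, probes


def multipass_groupby_sum(keys, values, capacity, n_passes):
    """Hypothetical multipass variant: one hash partition per pass.

    Each key belongs to exactly one partition, so the per-pass results
    are disjoint and can simply be merged. This stands in for the case
    where the table for all groups would not fit in device memory.
    """
    merged = {}
    for p in range(n_passes):
        part = [(k, v) for k, v in zip(keys, values) if hash(k) % n_passes == p]
        part_keys = [k for k, _ in part]
        part_vals = [v for _, v in part]
        result, _ = hash_groupby_sum(part_keys, part_vals, capacity)
        merged.update(result)
    return merged


if __name__ == "__main__":
    keys = [1, 2, 1, 3, 2, 1]
    values = [10, 20, 30, 40, 50, 60]
    print(hash_groupby_sum(keys, values, capacity=4)[0])
    print(multipass_groupby_sum(keys, values, capacity=2, n_passes=2))
```

In this sketch the single-pass function needs one slot per distinct key, while the multipass function trades extra scans of the input for a smaller table per pass.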
Related Results
P183 NEW METHOD OF ESOPHAGO-GASTRO ANASTOMOSIS WITHIN MINIMALLY INVASIVE HYBRID IVOR LEWIS PROCEDURE
Abstract
Aim
Improve the results of surgical treatment of esophageal cancer by developing and implementing a new method for the ...
Vina-GPU 2.1: towards further optimizing docking speed and precision of AutoDock Vina and its derivatives
Abstract
AutoDock Vina and its derivatives have established themselves as a prevailing pipeline for virtual screening in contemporary drug discovery. Our Vina-GPU method leverages t...
GPU-I-TASSER: a GPU accelerated I-TASSER protein structure prediction tool
Abstract
Motivation
Accurate and efficient predictions of protein structures play an important role in understanding their funct...
Vina-GPU 2.0:further accelerating AutoDock Vina and its derivatives with GPUs
Modern drug discovery typically faces large virtual screens from huge compound databases where multiple docking tools are involved for meeting various real scenes or improving the ...
Oxygen levels regulating embryonic genome activation
Involuntary infertility affects some 18% of the world's population. Assisted reproductive technologies (ART), including in vitro fertilization (IVF), are for man...
Accelerated hydrologic modeling: ParFlow GPU implementation
ParFlow is known as a numerical model that simulates the hydrologic cycle from the bedrock to the top of the plant canopy. The original codebase pro...
PS01.150: NEW METHOD OF ESOPHAGO-GASTRO ANASTOMOSIS AFTER ESOPHAGECTOMY
Abstract
Background
Medical science has introduced a lot of innovations and advanced equipment since the first esophagectomy was...

