
Two novel cache management mechanisms on CPU-GPU heterogeneous processors

Heterogeneous multicore processors that take full advantage of CPUs and GPUs within the same chip raise an emerging challenge for sharing a series of on-chip resources, particularly Last-Level Cache (LLC) resources. Since the GPU core has good parallelism and memory latency tolerance, the majority of the LLC space is utilized by GPU applications. Under current cache management policies, the LLC share of CPU applications can be remarkably decreased by the presence of GPU workloads, seriously affecting overall performance. To alleviate the unfair contention between CPUs and GPUs for cache capacity, we propose two novel cache management mechanisms: a static cache partitioning scheme based on an adaptive replacement policy (SARP) and a dynamic cache partitioning scheme based on GPU miss awareness (DGMA). The SARP scheme first uses cache partitioning to split the cache ways between CPUs and GPUs, and then applies an adaptive cache replacement policy depending on the type of the request. The DGMA scheme monitors the GPU's cache performance metrics at run time and sets an appropriate threshold to dynamically change the partitioning ratio of the shared LLC across different kernels. Experimental results show that the SARP mechanism improves CPU performance by up to 32.6%, with an average gain of 8.4%. The DGMA scheme improves CPU performance while ensuring that GPU performance is not affected, achieving a maximum improvement of 18.1% and an average improvement of 7.7%.
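The DGMA idea described above — monitor the GPU's miss behavior at run time and shift LLC ways toward the CPU when the GPU's latency tolerance makes extra cache unprofitable — can be illustrated with a toy model. The sketch below is an illustrative assumption, not the authors' implementation: the class name, the 16-way/12-way split, the 0.5 miss-rate threshold, and the one-way-per-interval adjustment are all hypothetical choices.

```python
# Toy model of threshold-driven LLC way repartitioning in the spirit of
# a DGMA-style scheme. All parameters here are illustrative assumptions.

class WayPartitionedLLC:
    """A way-partitioned last-level cache shared by CPU and GPU cores."""

    def __init__(self, num_ways=16, gpu_ways=12):
        self.num_ways = num_ways
        self.gpu_ways = gpu_ways      # ways currently reserved for the GPU
        self.gpu_accesses = 0         # GPU accesses in the current interval
        self.gpu_misses = 0           # GPU misses in the current interval

    @property
    def cpu_ways(self):
        # The CPU owns whatever the GPU does not.
        return self.num_ways - self.gpu_ways

    def record_gpu_access(self, hit):
        # Called on every GPU LLC access to accumulate miss statistics.
        self.gpu_accesses += 1
        if not hit:
            self.gpu_misses += 1

    def gpu_miss_rate(self):
        if self.gpu_accesses == 0:
            return 0.0
        return self.gpu_misses / self.gpu_accesses

    def rebalance(self, miss_threshold=0.5, min_gpu_ways=4):
        """End-of-interval check: if the GPU misses frequently anyway
        (its parallelism already hides that latency), cede one way to
        the CPU, down to a floor that protects GPU performance."""
        if self.gpu_miss_rate() > miss_threshold and self.gpu_ways > min_gpu_ways:
            self.gpu_ways -= 1
        # Reset counters for the next monitoring interval.
        self.gpu_accesses = 0
        self.gpu_misses = 0
```

The `min_gpu_ways` floor mirrors the constraint stated in the abstract: CPU performance is improved only under the premise that GPU performance is not degraded.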

Related Results

An Efficient Software-Managed Cache Based on Cell Broadband Engine Architecture
While the CBEA (Cell Broadband Engine Architecture) offers substantial computational power, its explicit multilevel memory hierarchy poses significant challenges to traditional pro...
A Hierarchical Cache Architecture-Oriented Cache Management Scheme for Information-Centric Networking
Information-Centric Networking (ICN) typically utilizes DRAM (Dynamic Random Access Memory) to build in-network cache components due to its high data transfer rate and low latency....
Performant Automatic BLAS Offloading on Unified Memory Architecture with OpenMP First-Touch Style Data Movement
BLAS is a fundamental building block of advanced linear algebra libraries and many modern scientific computing applications. GPU is known for its strong arithmetic computing capabi...
Coordinated Energy Management in Heterogeneous Processors
This paper examines energy management in a heterogeneous processor consisting of an integrated CPU–GPU for high-performance computing (HPC) applications. Energy management for HPC ...
Vina-GPU 2.1: towards further optimizing docking speed and precision of AutoDock Vina and its derivatives
AutoDock Vina and its derivatives have established themselves as a prevailing pipeline for virtual screening in contemporary drug discovery. Our Vina-GPU method leverages t...
Unlocking the Power of Parallel Computing: GPU technologies for Ocean Forecasting
Operational ocean forecasting systems are complex engines that must execute ocean models with high performance to provide timely products and datasets. Significant comput...
Performance simulation methodologies for hardware/software co-designed processors
Recently the community started looking into Hardware/Software (HW/SW) co-designed processors as potential solutions to move towards the less power consuming and the less complex de...
Homology sequence analysis using GPU acceleration
A number of problems in bioinformatics, systems biology and computational biology field require abstracting physical entities to mathematical or computational models. In such studi...
