
Two novel cache management mechanisms on CPU-GPU heterogeneous processors

Heterogeneous multicore processors that take full advantage of CPUs and GPUs within the same chip raise an emerging challenge for sharing a series of on-chip resources, particularly Last-Level Cache (LLC) resources. Since the GPU core has good parallelism and memory latency tolerance, the majority of the LLC space is utilized by GPU applications. Under current cache management policies, the LLC share of CPU applications can be remarkably decreased by the presence of GPU workloads, seriously affecting overall performance. To alleviate the unfair contention between CPUs and GPUs for cache capacity, we propose two novel cache management mechanisms: a static cache partitioning scheme based on an adaptive replacement policy (SARP) and a dynamic cache partitioning scheme based on GPU miss awareness (DGMA). The SARP scheme first uses cache partitioning to split the cache ways between CPUs and GPUs, and then applies an adaptive cache replacement policy depending on the type of the request. The DGMA scheme monitors the GPU's cache performance metrics at run time and sets an appropriate threshold to dynamically change the partitioning ratio of the shared LLC across different kernels. Experimental results show that the SARP mechanism improves CPU performance by up to 32.6%, with an average gain of 8.4%. The DGMA scheme improves CPU performance while ensuring that GPU performance is not affected, achieving a maximum improvement of 18.1% and an average improvement of 7.7%.
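The DGMA idea described above — monitor the GPU's miss behavior at run time and shift LLC ways toward the CPU when the GPU's latency tolerance makes extra cache unprofitable — can be illustrated with a toy model. The sketch below is an illustrative assumption, not the authors' implementation: the class name, the 16-way/12-way split, the 0.5 miss-rate threshold, and the one-way-per-interval adjustment are all hypothetical choices.

```python
# Toy model of threshold-driven LLC way repartitioning in the spirit of
# a DGMA-style scheme. All parameters here are illustrative assumptions.

class WayPartitionedLLC:
    """A way-partitioned last-level cache shared by CPU and GPU cores."""

    def __init__(self, num_ways=16, gpu_ways=12):
        self.num_ways = num_ways
        self.gpu_ways = gpu_ways      # ways currently reserved for the GPU
        self.gpu_accesses = 0         # GPU accesses in the current interval
        self.gpu_misses = 0           # GPU misses in the current interval

    @property
    def cpu_ways(self):
        # The CPU owns whatever the GPU does not.
        return self.num_ways - self.gpu_ways

    def record_gpu_access(self, hit):
        # Called on every GPU LLC access to accumulate miss statistics.
        self.gpu_accesses += 1
        if not hit:
            self.gpu_misses += 1

    def gpu_miss_rate(self):
        if self.gpu_accesses == 0:
            return 0.0
        return self.gpu_misses / self.gpu_accesses

    def rebalance(self, miss_threshold=0.5, min_gpu_ways=4):
        """End-of-interval check: if the GPU misses frequently anyway
        (its parallelism already hides that latency), cede one way to
        the CPU, down to a floor that protects GPU performance."""
        if self.gpu_miss_rate() > miss_threshold and self.gpu_ways > min_gpu_ways:
            self.gpu_ways -= 1
        # Reset counters for the next monitoring interval.
        self.gpu_accesses = 0
        self.gpu_misses = 0
```

The `min_gpu_ways` floor mirrors the constraint stated in the abstract: CPU performance is improved only under the premise that GPU performance is not degraded.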

Related Results

An Efficient Software-Managed Cache Based on Cell Broadband Engine Architecture
While the CBEA (Cell Broadband Engine Architecture) offers substantial computational power, its explicit multilevel memory hierarchy poses significant challenges to traditional pro...
A Hierarchical Cache Architecture-Oriented Cache Management Scheme for Information-Centric Networking
Information-Centric Networking (ICN) typically utilizes DRAM (Dynamic Random Access Memory) to build in-network cache components due to its high data transfer rate and low latency....
Performant Automatic BLAS Offloading on Unified Memory Architecture with OpenMP First-Touch Style Data Movement
BLAS is a fundamental building block of advanced linear algebra libraries and many modern scientific computing applications. GPU is known for its strong arithmetic computing capabi...
Coordinated Energy Management in Heterogeneous Processors
This paper examines energy management in a heterogeneous processor consisting of an integrated CPU–GPU for high-performance computing (HPC) applications. Energy management for HPC ...
Vina-GPU 2.1: towards further optimizing docking speed and precision of AutoDock Vina and its derivatives
AutoDock Vina and its derivatives have established themselves as a prevailing pipeline for virtual screening in contemporary drug discovery. Our Vina-GPU method leverages t...
Unlocking the Power of Parallel Computing: GPU technologies for Ocean Forecasting
Operational ocean forecasting systems are complex engines that must execute ocean models with high performance to provide timely products and datasets. Significant comput...
Performance simulation methodologies for hardware/software co-designed processors
Recently the community started looking into Hardware/Software (HW/SW) co-designed processors as potential solutions to move towards the less power consuming and the less complex de...
Homology sequence analysis using GPU acceleration
A number of problems in bioinformatics, systems biology and computational biology field require abstracting physical entities to mathematical or computational models. In such studi...
