Methodical Evaluation of Processing-in-Memory Alternatives
In this work, I characterized a series of potential application kernels using a set of architectural and non-architectural metrics, and compared four alternatives for processing-in-memory (PIM) cores: ARM cores, GPGPUs, a coarse-grained reconfigurable dataflow architecture (DF-PIM), and a domain-specific architecture using a SIMD PIM engine consisting of a series of multiply-accumulate circuits (MACs). For each PIM alternative, I investigated how performance and energy efficiency change with respect to a series of system parameters, such as memory bandwidth and latency, number of PIM cores, DVFS states, and cache architecture. In addition, I compared the PIM core choices for a subset of applications and discussed how the application characteristics correlate with the achieved performance and energy efficiency. Furthermore, I compared the PIM alternatives to a host-centric solution that uses either a traditional server-class CPU core or PIM-like cores acting as host-side accelerators instead of being part of 3D-stacked memories. These comparisons can expose the achievable performance limits and shortcomings of certain PIM designs and show their sensitivity to a series of system parameters (available memory bandwidth, application latency and bandwidth sensitivity, etc.). In addition, identifying the common application characteristics of PIM kernels provides an opportunity to identify similar computation patterns in other applications and allows us to create a set of applications that can then be used as benchmarks for evaluating future PIM design alternatives.

