Performance and Programming Environment of a Combined GPU/FPGA Desktop
The performance and versatility of today's PCs exceed the power of the fastest number crunchers of the 1990s many times over. Yet the computational hunger of many scientific applications has led to the development of GPU and FPGA accelerator cards. This paper presents the programming environment and the performance analysis of a super desktop with a combined GPU/FPGA architecture. A unified roofline model is used to compare the performance of the GPU and the FPGA, taking into account the computational intensity of the algorithm and the resource consumption. The model is validated with two image processing kernels, which are compiled using OpenCL for the GPU and a C-to-VHDL compiler for the FPGA. It is shown that an FPGA compiler outperforms handwritten code and is highly productive, but also uses more resources. While the GPU and the FPGA each excel in particular applications, both devices suffer from the limited I/O bandwidth to the processor.
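The abstract refers to a roofline model without stating it. As a rough illustration only, the sketch below implements the classic roofline bound, in which attainable performance is the smaller of the compute ceiling and memory bandwidth multiplied by computational intensity. It is not the paper's unified GPU/FPGA variant, which also accounts for resource consumption. The function names and all numbers are hypothetical and are not taken from the paper.

```python
def attainable_gflops(peak_gflops: float,
                      bandwidth_gb_s: float,
                      intensity_flops_per_byte: float) -> float:
    """Classic roofline bound in GFLOP/s.

    Performance is limited either by the compute ceiling (peak_gflops)
    or by memory traffic (bandwidth * computational intensity),
    whichever is lower.
    """
    if min(peak_gflops, bandwidth_gb_s, intensity_flops_per_byte) < 0:
        raise ValueError("all arguments must be non-negative")
    return min(peak_gflops, bandwidth_gb_s * intensity_flops_per_byte)


def ridge_point(peak_gflops: float, bandwidth_gb_s: float) -> float:
    """Computational intensity (FLOP/byte) at which a kernel stops being
    bandwidth-bound and becomes compute-bound."""
    if bandwidth_gb_s <= 0:
        raise ValueError("bandwidth must be positive")
    return peak_gflops / bandwidth_gb_s


# Hypothetical device used only for illustration (not from the paper):
# 1000 GFLOP/s peak compute, 100 GB/s memory bandwidth.
HYPOTHETICAL_PEAK = 1000.0
HYPOTHETICAL_BANDWIDTH = 100.0
```

For the hypothetical device above, a kernel performing 2 FLOP per byte would be bandwidth-bound, while one performing 20 FLOP per byte would reach the compute ceiling. The same comparison can be repeated with each accelerator's own peak and bandwidth figures to see on which side of the ridge point a kernel falls.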
Related Results
Method of QoS evaluation of FPGA as a service
The subject of study in this article is the evaluation of the performance issues of cloud services implemented using FPGA technology. The goal is to improve the performance of clou...
Analysis of the Application of FPGA Technologies as Part of IoT (Аналіз застосування технологій ПЛІС в складі IoT)
The subject of study in this article is the modern technologies of programmable logic devices (PLD) classified as FPGA, and the peculiarities of their application in Interne...
Methods of Deployment and Evaluation of FPGA as a Service Under Conditions of Changing Requirements and Environments
Applying Field Programmable Gate Array (FPGA) technology in cloud infrastructure and heterogeneous computations is of great interest today. FPGA as a Service assumes that the progr...
Vina-GPU 2.1: towards further optimizing docking speed and precision of AutoDock Vina and its derivatives
AutoDock Vina and its derivatives have established themselves as a prevailing pipeline for virtual screening in contemporary drug discovery. Our Vina-GPU method leverages t...
Comparison of HDL and HLL Development Approaches on FPGAs for Image Processing Applications (Comparación de enfoques de desarrollo HDL y HLL en FPGA para aplicaciones de procesamiento de imágenes)
Since their invention in the mid-1990s, FPGAs have stood out for their great computing power, low energy consumption, and high flexibility in reconfiguring their internal architecture...
Vina-GPU 2.0: further accelerating AutoDock Vina and its derivatives with GPUs
Modern drug discovery typically faces large virtual screens from huge compound databases where multiple docking tools are involved for meeting various real scenes or improving the ...
GPU-I-TASSER: a GPU accelerated I-TASSER protein structure prediction tool
Motivation: Accurate and efficient predictions of protein structures play an important role in understanding their funct...
Parallel garment drape simulation of triangular mesh using GPU programming
Purpose: The purpose of this paper is to determine the possibility of implementing the parallel processing feature of the graphics processing unit (GPU) in garment drape simulation. Design/meth...


