Optimizing Soft Vector Processing in FPGA-Based Embedded Systems

Soft vector processors can augment and extend the capability of FPGA-based embedded systems-on-chip such as the Xilinx Zynq. However, configuring and optimizing the soft processor for best performance is hard. We must consider architectural parameters such as precision, vector lane count, vector length, chunk size, and DMA scheduling to ensure efficient execution of code on the soft vector processing platform. To simplify the design process, we develop a compiler framework and an autotuning runtime that splits the optimization into a combination of static and dynamic passes that map data-parallel computations to the soft processor. We compare and contrast implementations running on the scalar ARM processor, the embedded NEON hard vector engine, and low-level streaming Verilog designs with the VectorBlox MXP soft vector processor. Across a range of data-parallel benchmarks, we show that the MXP soft vector processor can outperform other organizations by up to 4× while saving ≈10% dynamic power. Our compilation and runtime framework is also able to outperform the gcc NEON vectorizer under certain conditions by explicit generation of NEON intrinsics and performance tuning of the autogenerated data-parallel code. When constrained by IO bandwidth, soft vector processors are even competitive with spatial Verilog implementations of computation.

Related Results

Method of QoS evaluation of FPGA as a service
The subject of study in this article is the evaluation of the performance issues of cloud services implemented using FPGA technology. The goal is to improve the performance of clou...
Analysis of the Application of FPGA Technologies in IoT (Аналіз застосування технологій ПЛІС в складі IoT)
The subject of study in this article and work is the modern technologies of programmable logic devices (PLD) classified as FPGA, and the peculiarities of its application in Interne...
Between the Classes of Soft Open Sets and Soft Omega Open Sets
In this paper, we define the class of soft ω0-open sets. We show that this class forms a soft topology that is strictly between the classes of soft open sets and soft ω-open sets, ...
Weaker Forms of Soft Regular and Soft T2 Soft Topological Spaces
Soft ω-local indiscreetness as a weaker form of both soft local countability and soft local indiscreetness is introduced. Then soft ω-regularity as a weaker form of both soft regul...
Methods of Deployment and Evaluation of FPGA as a Service Under Conditions of Changing Requirements and Environments
Applying Field Programmable Gate Array (FPGA) technology in cloud infrastructure and heterogeneous computations is of great interest today. FPGA as a Service assumes that the progr...
Soft Complete Continuity and Soft Strong Continuity in Soft Topological Spaces
In this paper, we introduce soft complete continuity as a strong form of soft continuity and we introduce soft strong continuity as a strong form of soft complete continuity. Sever...
Comparison of HDL and HLL Development Approaches on FPGA for Image Processing Applications (Comparación de enfoques de desarrollo HDL y HLL en FPGA para aplicaciones de procesamiento de imágenes)
Since their invention in the mid-1990s, FPGAs have stood out for their great computing power, low energy consumption, and high flexibility in reconfiguring their internal architecture...
Soft Semi ω-Open Sets
In this paper, we introduce the class of soft semi ω-open sets of a soft topological space (X,τ,A), using soft ω-open sets. We show that the class of soft semi ω-open sets contains...
