SYCL in the Edge: Performance Evaluation for Heterogeneous Acceleration
Abstract
Edge computing is essential to handle increasing data volumes and processing capacities. It provides real-time, secure data processing near data sources such as smart devices, reducing the energy use of cloud computing and saving network bandwidth. Specialized accelerators, like GPUs and FPGAs, are vital for low-latency edge computing, but the need to customize code for different hardware and vendors poses significant compatibility issues. This paper evaluates the potential of SYCL to address the code portability issues encountered in edge computing. We employed the Polybench suite to compare SYCL implementations, specifically DPC++ and AdaptiveCpp, with the native solution, CUDA. The disparity between SYCL implementations was negligible, at just 5%.
Furthermore, we evaluated SYCL in the context of specific edge computing applications, such as video processing, using three different optical flow algorithms. The results exposed a potential performance gap of 19% between CUDA and SYCL. This performance differential is the price one may need to pay for the ability to run the same code on two distinct edge boards with four different architectures: x86-64 CPU, ARM CPU, NVIDIA GPU, and Intel GPU. These findings underscore SYCL’s capacity to increase productivity in terms of development costs and to facilitate IoT deployment without being locked into a particular platform or manufacturer.
Related Results
SYCL-BLAS: Combining Expression Trees and Kernel Fusion on Heterogeneous Systems
Support for heterogeneous platforms requires multiple specialised devices to collaborate to execute an application. The SYCL standard, published by Khronos, provides a C++ abstract...
AI-driven zero-touch orchestration of edge-cloud services
6G networks demand orchestration systems capable of managing thousands of distributed microservices under sub-millisecond latency constraints. Traditional centralized app...
Magic graphs
If a graph G admits a super edge-magic labeling, then G is said to be a super edge-magic graph. The thesis focuses mainly on the study of the c...
SYCL: Single-source C++ accelerator programming
Hybrid systems have been massively adopted in high performance clusters and scientific applications. The latest Top500 [1] HPC list shows an increased number of heterogeneous proce...
Non-Recommended Publishing Lists: Strategies for Detecting Deceitful Journals
The rapid growth of open access publishing (OAP) has significantly improved the accessibility and dissemination of scientific knowledge. However, this expansion has also c...
Product of digraphs, (super) edge-magic valences and related problems
Discrete Mathematics, and in particular Graph Theory, has gained a lot of popularity during the last 7 decades. Among the many branches in Graph Theory, graph labelings has experim...
Broad Flight Envelope Acceleration Control Method of Aero-Engine Based on Multiperiod Optimization Strategy
A new method based on a multiperiod optimization strategy is proposed to address the limited aero-engine acceleration performance across a broad flight enve...
THE FORCING EDGE FIXING EDGE-TO-VERTEX MONOPHONIC NUMBER OF A GRAPH
For a connected graph G = (V, E), a set Se ⊆ E(G)–{e} is called an edge fixing edge-to-vertex monophonic set of an edge e of a connected graph G if every vertex of G lies on an e –...

