
SYCL-BLAS: Combining Expression Trees and Kernel Fusion on Heterogeneous Systems

Supporting heterogeneous platforms requires multiple specialised devices to collaborate in executing an application. The SYCL standard, published by Khronos, provides a C++ abstraction layer on top of OpenCL that enables single-source programming for a large number of heterogeneous devices. Single-source programming and a task data-flow approach enable SYCL developers to leverage modern programming techniques on heterogeneous platforms. In this paper, we present how SYCL combines expression tree templates and kernel fusion to develop SYCL-BLAS, an efficient BLAS implementation for heterogeneous platforms. Templates make it possible to generate the BLAS kernels for each BLAS routine, whereas kernel fusion merges the expression trees, enlarging the BLAS kernels. These features demonstrate that SYCL, by providing sufficient levels of abstraction, can be used to quickly develop libraries for heterogeneous systems. Our experiments compare the performance of clBLAS and SYCL-BLAS on a server equipped with an Intel Core i7-6700K CPU and an AMD R9 GPU.
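The idea of combining expression trees with kernel fusion can be illustrated with a minimal sketch in standard C++. This is not SYCL code and not the SYCL-BLAS API: every name here (`VecLeaf`, `Scale`, `Add`, `Mul`, `assign`, `reduce_sum`, `fused_axpy_dot`) is hypothetical, and a plain host loop stands in for a device kernel. Each BLAS-like operation is a template node, and the whole tree is evaluated in one loop, so an AXPY followed by a DOT becomes a single pass with no temporary vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Leaf node: reads element i of a vector it does not own.
struct VecLeaf {
    const std::vector<double>* data;
    double eval(std::size_t i) const { return (*data)[i]; }
    std::size_t size() const { return data->size(); }
};

// Node for alpha * expr.
template <typename E>
struct Scale {
    double alpha;
    E expr;
    double eval(std::size_t i) const { return alpha * expr.eval(i); }
    std::size_t size() const { return expr.size(); }
};

// Node for lhs + rhs (both sides are assumed to have the same length).
template <typename L, typename R>
struct Add {
    L lhs;
    R rhs;
    double eval(std::size_t i) const { return lhs.eval(i) + rhs.eval(i); }
    std::size_t size() const { return lhs.size(); }
};

// Node for the element-wise product lhs * rhs.
template <typename L, typename R>
struct Mul {
    L lhs;
    R rhs;
    double eval(std::size_t i) const { return lhs.eval(i) * rhs.eval(i); }
    std::size_t size() const { return lhs.size(); }
};

// Small builders so the tree type is deduced by the compiler.
inline VecLeaf leaf(const std::vector<double>& v) { return VecLeaf{&v}; }
template <typename E>
Scale<E> scale(double a, E e) { return Scale<E>{a, e}; }
template <typename L, typename R>
Add<L, R> add(L l, R r) { return Add<L, R>{l, r}; }
template <typename L, typename R>
Mul<L, R> mul(L l, R r) { return Mul<L, R>{l, r}; }

// Stand-in for an element-wise kernel: one loop evaluates the whole tree.
template <typename E>
void assign(std::vector<double>& out, const E& tree) {
    assert(out.size() == tree.size());
    for (std::size_t i = 0; i < tree.size(); ++i) out[i] = tree.eval(i);
}

// Stand-in for a reduction kernel: sums the tree's elements in one loop.
template <typename E>
double reduce_sum(const E& tree) {
    double acc = 0.0;
    for (std::size_t i = 0; i < tree.size(); ++i) acc += tree.eval(i);
    return acc;
}

// Fused AXPY + DOT: computes sum_i (alpha * x[i] + y[i]) * z[i] in a single
// pass, without materialising the intermediate vector alpha * x + y.
inline double fused_axpy_dot(double alpha, const std::vector<double>& x,
                             const std::vector<double>& y,
                             const std::vector<double>& z) {
    return reduce_sum(mul(add(scale(alpha, leaf(x)), leaf(y)), leaf(z)));
}
```

Calling `assign(out, add(scale(alpha, leaf(x)), leaf(y)))` evaluates an AXPY-style tree on its own, while `fused_axpy_dot` merges that same tree into a larger one before any loop runs. In an actual SYCL setting the loop body would presumably run as a device kernel; that mapping is outside the scope of this sketch.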

Related Results

The Nuclear Fusion Award
The Nuclear Fusion Award ceremony for 2009 and 2010 award winners was held during the 23rd IAEA Fusion Energy Conference in Daejeon. This time, both 2009 and 2010 award winners w...
Physicochemical Properties of Wheat Fractionated by Wheat Kernel Thickness and Separated by Kernel Specific Density
Abstract: Two wheat cultivars, soft white winter wheat Yang‐mai 11 and hard white winter wheat Zheng‐mai 9023, were fractionated by kernel thickness into five sections; the fractiona...
Genetic Variation in Potential Kernel Size Affects Kernel Growth and Yield of Sorghum
Large‐seededness can increase grain yield in sorghum [Sorghum bicolor (L.) Moench] if larger kernel size more than compensates for the associated reduction in kernel number. The ai...
Sorghum Kernel Weight
The influence of genotype and panicle position on sorghum [Sorghum bicolor (L.) Moench] kernel growth is poorly understood. In the present study, sorghum kernel weight (KW) differe...
SYCL: Single-source C++ accelerator programming
Hybrid systems have been massively adopted in high performance clusters and scientific applications. The latest Top500 [1] HPC list shows an increased number of heterogeneous proce...
Performant Automatic BLAS Offloading on Unified Memory Architecture with OpenMP First-Touch Style Data Movement
BLAS is a fundamental building block of advanced linear algebra libraries and many modern scientific computing applications. GPU is known for its strong arithmetic computing capabi...
Nonproliferation and fusion power plants
Abstract: The world now appears to be on the brink of realizing commercial fusion. As fusion energy progresses towards near-term commercial deployment, the question arises a...
Geographical variation in Canarium indicum (Burseraceae) nut characteristics across Vanuatu
Abstract: Tropical forests in the Pacific region contain many tree species that bear edible nuts (kernels). Canarium indicum (canarium) is an overstorey tree indigenous to ...
