
Memory-driven computing accelerates genomic data processing

Next generation sequencing (NGS) is the driving force behind precision medicine and is revolutionizing most, if not all, areas of the life sciences. Particularly when targeting the major common diseases, an exponential growth of NGS data is foreseen for the coming decades. This enormous increase in NGS data, together with the need to process the data quickly for real-world applications, requires us to rethink our current compute infrastructures. Here we provide evidence that memory-driven computing (MDC), a novel memory-centric hardware architecture, is an attractive alternative to current processor-centric compute infrastructures. To illustrate how MDC can change NGS data handling, we used RNA-seq assembly and pseudoalignment followed by quantification as two initial examples. Adapting transcriptome assembly pipelines for MDC reduced compute time by 5.9-fold for the first step (SAMtools). Even more impressive, pseudoalignment by near-optimal probabilistic RNA-seq quantification (kallisto) was accelerated by more than two orders of magnitude with identical accuracy and an indicated 66% reduction in energy consumption. One billion RNA-seq reads were processed in just 92 seconds. Clearly, MDC simultaneously reduces data processing time and energy consumption. Together with the MDC-inherent solutions for local data privacy, a new compute model can be projected that pushes large-scale NGS data processing and primary data analytics closer to the edge by directly combining high-end sequencers with local MDC, thereby also reducing the movement of large raw data to central cloud storage. We further envision that other data-rich areas will similarly benefit from this new memory-centric compute architecture.
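To put the reported figures in perspective, the arithmetic behind them can be sketched as follows. This is only an illustration: the function names are hypothetical, and the 59-minute baseline and 100-unit energy figure are made-up placeholders, not values from the article. Only the 92 seconds, the one billion reads, the 5.9-fold factor, and the 66% reduction come from the abstract.

```python
def reads_per_second(total_reads: int, seconds: float) -> float:
    """Average throughput implied by processing total_reads in the given time."""
    return total_reads / seconds


def time_after_speedup(baseline_seconds: float, fold: float) -> float:
    """Runtime left after an N-fold reduction in compute time."""
    return baseline_seconds / fold


def energy_after_reduction(baseline_energy: float, reduction: float) -> float:
    """Energy left after a fractional reduction (0.66 means 66% less)."""
    return baseline_energy * (1.0 - reduction)


# Figures taken from the abstract: one billion reads in 92 seconds.
throughput = reads_per_second(1_000_000_000, 92)
print(f"{throughput:,.0f} reads per second")

# Hypothetical baseline (assumption): a 59-minute step with a 5.9-fold speedup.
print(f"{time_after_speedup(59 * 60, 5.9) / 60:.1f} minutes")

# Hypothetical baseline (assumption): 100 energy units with a 66% reduction.
print(f"{energy_after_reduction(100.0, 0.66):.1f} energy units")
```

Under these assumptions, the kallisto result corresponds to roughly 10.9 million reads per second, and a 66% reduction leaves about one third of the baseline energy.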

Related Results

Advancements in Quantum Computing and Information Science
Abstract: The chapter "Advancements in Quantum Computing and Information Science" explores the fundamental principles, historical development, and modern applications of quantum co...
Accuracy and computational efficiency of genomic selection with high-density SNP and whole-genome sequence data.
Abstract The prediction of complex or quantitative traits from single nucleotide polymorphism (SNP) genotypes has transformed livestock and plant breeding, and is also pl...
Tools and techniques for real-time data processing: A review
Real-time data processing is an essential component in the modern data landscape, where vast amounts of data are generated continuously from various sources such as Internet of Thi...
Processing genome-wide association studies within a repository of heterogeneous genomic datasets
Abstract Background Genome Wide Association Studies (GWAS) are based on the observation of genome-wide sets of genetic variants – typically single-n...
High Dimensional Computing on Arabic Language Classification
Abstract The brain circuit is enormous regarding quantities of neurons and neuro-transmitters, proposing that huge circuits are the main entity to the brain-core processing...
THE IMPACT OF CLOUD COMPUTING ON CONSTRUCTION PROJECT DELIVERY ABUJA NIGERIA
Cloud computing is the delivery of computing services, such as storage, processing power, and software applications, via the internet. Cloud computing offers various advantages and...
Array‐Based Genomics in Glioma Research
Abstract Over the years, several relevant biomarkers with a potential clinical interest have been identified in gliomas using various techniques, such as karyotype, microsatellite a...
