Rapid multiple protein sequence search by parallel and heterogeneous computation
Abstract
Motivation
Protein sequence database search and multiple sequence alignment generation are fundamental tasks in many bioinformatics analyses. As sequence data volumes continue to grow rapidly, there is an increasing need for efficient and scalable multi-sequence query algorithms that can search very large databases without prohibitive time and computational costs.
Results
We introduce Chorus, a novel protein sequence query system that leverages a parallel model and a heterogeneous computing architecture to enable users to query thousands of protein sequences concurrently against large protein databases on a desktop workstation. Chorus achieves a speedup of more than 100× over BLASTP without sacrificing sensitivity. We demonstrate the utility of Chorus through a case study in which a ∼1.5-TB large-scale metagenomic dataset is analyzed for novel CRISPR-Cas protein discovery within 30 min.
Availability and implementation
Chorus is open-source and its code repository is available at https://github.com/Bio-Acc/Chorus.
Oxford University Press (OUP)