BioWorkbench: a high-performance framework for managing and analyzing bioinformatics experiments
Advances in sequencing techniques have led to exponential growth in biological data, demanding the development of large-scale bioinformatics experiments. Because these experiments are computation- and data-intensive, they require high-performance computing techniques and can benefit from specialized technologies such as Scientific Workflow Management Systems and databases. In this work, we present BioWorkbench, a framework for managing and analyzing bioinformatics experiments. This framework automatically collects provenance data, including both performance data from workflow execution and data from the scientific domain of the workflow application. Provenance data can be analyzed through a web application that abstracts a set of queries to the provenance database, simplifying access to provenance information. We evaluate BioWorkbench using three case studies: SwiftPhylo, a phylogenetic tree assembly workflow; SwiftGECKO, a comparative genomics workflow; and RASflow, a RASopathy analysis workflow. We analyze each workflow from both computational and scientific domain perspectives, using queries to a provenance and annotation database. Some of these queries are available as a pre-built feature of the BioWorkbench web application. Through the provenance data, we show that the framework is scalable and achieves high performance, reducing the execution time of the case studies by up to 98%. We also show how the application of machine learning techniques can enrich the analysis process.
Related Results
Advancements in Biomedical and Bioinformatics Engineering
Abstract: The field of biomedical and bioinformatics engineering is witnessing rapid advancements that are revolutionizing healthcare and medical research. This chapter provides a...
A large-scale analysis of bioinformatics code on GitHub
Abstract: In recent years, the explosion of genomic data and bioinformatic tools has been accompanied by a growing conversation around reproducibility of results and usability of sof...
New classifications for quantum bioinformatics: Q-bioinformatics, QCt-bioinformatics, QCg-bioinformatics, and QCr-bioinformatics
Abstract: Bioinformatics has revolutionized biology and medicine by using computational methods to analyze and interpret biological data. Quantum mechanics has recent...
Bioinformatics tool and web server development focusing on structural bioinformatics applications
This thesis is divided into two main sections: Part 1 describes the design, and evaluation of the accuracy of a new web server – PRotein Interactive MOdeling (PRIMO-Complexes) for ...
Cloud Computing in Bioinformatics: current solutions and challenges
Abstract: Motivations: The availability of high-throughput technologies and the application of g...
The Role and Progress of Bioinformatics in Genomics Research
With the rapid development of high-throughput sequencing technology, genomics research has entered the era of big data. Bioinformatics, as a bridge connecting biology, computer sci...
Leveraging bioinformatics to enhance multi-sensory environmental art design: Insights from molecular and cellular biomechanics and human experience
With the rapid development of bioinformation technology and its wide application in various fields, its combination with multi-sensory environmental art design provides new possibi...
CREDO: a friendly Customizable, REproducible, DOcker file generator for bioinformatics applications
Abstract: Background: The analysis of large and complex biological datasets in bioinformatics poses a significant challenge to achieving reproducible ...

