An extensible genome annotation workbench based on the Galaxy Platform
Introduction
Falling costs of genetic sequencing have made it feasible to sequence and annotate the genomes of non-model organisms. In annotating non-model genomes, mRNA-seq data has great potential to improve annotation quality. For example, the Asian seabass (L. calcarifer) genome annotation effort drew on previously assembled mRNA-seq data provided by the Temasek Life Sciences Laboratory (TLL, Singapore). At the South African National Bioinformatics Institute (SANBI) we undertook gene annotation of the Asian seabass genome using a pipeline built from custom scripts. Simultaneously, a team at Saint Petersburg State University undertook the same task using MAKER2. Comparing the results of the two annotation efforts highlighted the impact of tool and parameter choice on gene prediction.
The Galaxy framework allows workflows to be constructed in a high-level workflow language that hides the system-specific details of their implementation. We implemented genome annotation workflows in Galaxy, demonstrating its suitability for constructing an annotation workbench that incorporates reusable and replaceable modules.
Conclusion
Repeatable genome analysis workflows allow for reuse of methods and reproducibility of results.
Galaxy workflows allow workflow construction in a flowchart-like idiom similar to the way in which workflows are documented.
We demonstrate the construction of two workflows as part of a larger annotation workbench and their use in the annotation of the L. calcarifer genome.
Exposing results through Jupyter notebooks and exporting them to browsers such as JBrowse and the (SANBI-authored) Bass Explorer allow results to be examined seamlessly from within Galaxy.
Future Work
We intend to implement further workflows to provide a complete Galaxy-based eukaryotic genome annotation workbench.
We will enhance tools that export to an annotation database browser modelled on the Bass Explorer.

