BGDMdocker: a Docker workflow for analysis and visualization pan-genome and biosynthetic gene clusters of bacterial
ABSTRACT
Motivation
Docker technology has received increasing attention throughout the bioinformatics community. However, most biologists have not yet mastered its implementation details, and it is not yet widely applied in biological research. To popularize this technology in bioinformatics and to make full use of the many public bioinformatics resources (community, official, and private Dockerfiles and images) available in the Docker Hub Registry and other Docker sources, we present in this article a complete and accurate example of a Docker-based bioinformatics workflow for analysing and visualizing the pan-genome and biosynthetic gene clusters of a bacterium, and we provide solutions for mining bioinformatics big data from various public biological databases. Readers are guided step by step through the workflow, from the Dockerfile to building their own images and running a container, so that a workflow can be created quickly.
Results
We present BGDMdocker (bacterial genome data mining, Docker-based), a workflow built on Docker. The workflow consists of three integrated toolkits: Prokka v1.11, panX, and antiSMASH3.0. All dependencies are written in a Dockerfile, which is used to build a Docker image and run a container for analysing the pan-genome of 44 Bacillus amyloliquefaciens strains retrieved from a public database. The pan-genome includes a total of 172,432 genes and 2,306 core gene clusters. The visualized pan-genomic data are presented, including alignments, phylogenetic trees, mutations within each cluster mapped to the branches of the tree, and inferred gene loss and gain on the core-genome phylogeny for each gene cluster. In addition, 997 known (MIBiG database) and 553 unknown (antiSMASH-predicted clusters and Pfam database) genes of biosynthetic gene cluster types and orthologous groups were mined across all strains. The workflow can also be used for pan-genome analysis and visualization of other species. The visual displays shown in this paper can be fully reproduced. All result data and the relevant tools and files can be downloaded from our website without registration. The pan-genome and biosynthetic gene cluster analysis and visualization are immediately reusable on different computing platforms (Linux, Windows, Mac, and cloud deployments), providing cross-platform deployment flexibility and rapid development of an integrated software package.
Availability and implementation
BGDMdocker is available at http://42.96.173.25/bapgd/ and the source code, under the GPL license, is available at https://github.com/cgwyx/debian_prokka_panx_antismash_biodocker.
Contact
chenggongwyx@foxmail.com
Supplementary information
Supplementary data are available at bioRxiv online.

