
BGDMdocker: a Docker workflow for analysis and visualization pan-genome and biosynthetic gene clusters of bacterial

ABSTRACT
Motivation: Docker technology is receiving increasing attention throughout the bioinformatics community. However, most biologists have not yet mastered its implementation details, and it is not yet widely applied in biological research. To popularize this technology in bioinformatics and to make full use of the many public bioinformatics tool resources (community, official, and private Dockerfiles and images) in the Docker Hub Registry and other Docker sources, we present a complete worked instance of a Docker-based bioinformatics workflow for analysing and visualizing the pan-genome and biosynthetic gene clusters of a bacterium, providing a solution for mining bioinformatics big data from various public biology databases. The reader is guided step by step through the workflow, from a Dockerfile to building images and running a container, so that a workflow can be set up quickly.
Results: We present BGDMdocker (bacterial genome data mining, Docker-based), a workflow built on Docker. It consists of three integrated toolkits: Prokka v1.11, panX, and antiSMASH3.0. All dependencies are written in a Dockerfile, which is used to build the Docker image and run a container for analysing the pan-genome of 44 Bacillus amyloliquefaciens strains retrieved from a public database. The pan-genome contains a total of 172,432 genes and 2,306 core gene clusters. For each gene cluster, the workflow presents visualized pan-genomic data: the alignment, the phylogenetic tree, the mutations within that cluster mapped to the branches of the tree, and the gene gain and loss inferred on the core-genome phylogeny. In addition, 997 known (MIBiG database) and 553 unknown (antiSMASH-predicted clusters and Pfam database) genes of biosynthetic gene cluster types and orthologous groups were mined across all strains. The workflow can also be used for pan-genome analysis and visualization of other species. The visualizations shown in this paper can be reproduced in full. All result data and the relevant tools and files can be downloaded from our website without registration. The pan-genome and biosynthetic gene cluster analysis and visualization can be reused immediately on different computing platforms (Linux, Windows, Mac, and cloud deployments), giving flexible cross-platform deployment and rapid development of an integrated software package.
Availability and implementation: BGDMdocker is available at http://42.96.173.25/bapgd/ and the source code, under the GPL license, is available at https://github.com/cgwyx/debian_prokka_panx_antismash_biodocker .
Contact: chenggongwyx@foxmail.com
Supplementary information: Supplementary data are available at biorxiv online.
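The abstract describes going from a Dockerfile to a built image and a running container. As a rough illustration of those two steps only, the Python sketch below assembles the corresponding `docker build` and `docker run` command lines. The image tag `bgdmdocker:local`, the `/data` mount point, and the directory paths are hypothetical placeholders and are not taken from the paper or its repository; the sketch only builds the argument lists and does not invoke Docker.

```python
import shlex

# Hypothetical image tag; the paper does not name one.
IMAGE_TAG = "bgdmdocker:local"


def build_command(context_dir: str, tag: str = IMAGE_TAG) -> list[str]:
    """Return a `docker build` command for the Dockerfile found in context_dir."""
    return ["docker", "build", "-t", tag, context_dir]


def run_command(host_data_dir: str, tag: str = IMAGE_TAG,
                mount_point: str = "/data") -> list[str]:
    """Return a `docker run` command for an interactive, disposable container.

    host_data_dir is bind-mounted at mount_point (an assumed location) so that
    genome files on the host are visible inside the container.
    """
    volume = f"{host_data_dir}:{mount_point}"
    return ["docker", "run", "--rm", "-it", "-v", volume, tag]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in (build_command("."), run_command("/path/to/genomes")):
        print(shlex.join(cmd))
```

On a machine with Docker installed, either list could be passed to `subprocess.run(cmd, check=True)`. Which tools are then called inside the container depends on the project's own Dockerfile and documentation.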

Related Results

Critical assessment of pan-genomics of metagenome-assembled genomes
Abstract Background Large scale metagenome assembly and binning to generate metagenome-assembled genomes (MAGs) has become possible in the past five years. As a result, millions of M...
CREDO: a friendly Customizable, REproducible, DOcker file generator for bioinformatics applications
Abstract Background The analysis of large and complex biological datasets in bioinformatics poses a significant challenge to achieving reproducible ...
Automating Assessment of the Undiscovered Biosynthetic Potential of Actinobacteria
Background. Biosynthetic potential of Actinobacteria has long been the subject of theoretical estimates. Such an estimate is indeed important as a test of further exploitability of...
Docker File for Publishing Digital Storage with Nextcloud at PT. Telkom Akses
The rapid digital transformation in the telecommunications industry is driving PT. Telkom Akses to adopt containerization technology to improve the efficiency and scalability of infr...
Expression and polymorphism of genes in gallstones
ABSTRACT Through the method of clinical case control study, to explore the expression and genetic polymorphism of KLF14 gene (rs4731702 and rs972283) and SR-B1 gene...
Perturbation of GABA Biosynthesis Links Cell Cycle to Control Arabidopsis thaliana Leaf Development
Abstract To investigate the molecular mechanism underlying increasing leaf area in γ-Aminobutyric acid (GABA) biosynthetic mutants, the first pair of true leaves of GABA biosyntheti...
