Barcodes for genomes and applications
Abstract
Background
Each genome has a stable distribution of the combined frequency of each k-mer and its reverse complement, measured across the whole genome in sequence fragments as short as 1000 bp, for 1<k<6. The collection of these k-mer frequency distributions is unique to each genome and is termed the genome's barcode.
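The barcode definition above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of non-overlapping 1000 bp windows, the default k=4, and the choice to skip k-mers containing ambiguous bases are all assumptions made for illustration. Each k-mer is merged with its reverse complement by mapping both to the lexicographically smaller of the two, and each fragment yields one row of combined frequencies.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Represent a k-mer and its reverse complement by one shared key."""
    return min(kmer, reverse_complement(kmer))


def canonical_kmers(k):
    """Sorted list of all distinct (k-mer, reverse complement) keys."""
    return sorted({canonical("".join(p)) for p in product("ACGT", repeat=k)})


def fragment_frequencies(fragment, k):
    """Combined k-mer frequencies for one fragment, in canonical_kmers(k) order."""
    counts = Counter()
    total = 0
    for i in range(len(fragment) - k + 1):
        kmer = fragment[i:i + k]
        # Assumption: k-mers with ambiguous bases (e.g. N) are skipped.
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
            total += 1
    return [counts[c] / total if total else 0.0 for c in canonical_kmers(k)]


def barcode(genome, k=4, fragment_len=1000):
    """One frequency row per fragment; windows are non-overlapping (assumption)."""
    genome = genome.upper()
    starts = range(0, len(genome) - fragment_len + 1, fragment_len)
    return [fragment_frequencies(genome[s:s + fragment_len], k) for s in starts]
```

Calling `barcode(sequence, k=4)` on a genome string returns a matrix with one row per fragment and one column per combined k-mer. With k=4 there are 136 columns, because the 16 palindromic 4-mers are their own reverse complements.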
Results
We found that, for each genome, the majority of its short sequence fragments have highly similar barcodes, while sequence fragments with different barcodes typically correspond to genes that are horizontally transferred or highly expressed. This observation has led to new and more effective ways of addressing two challenging problems: the metagenome binning problem and the identification of horizontally transferred genes. Our barcode-based metagenome binning algorithm substantially improves on the state of the art in terms of both binning accuracy and scope of applicability. Other attractive properties of genome barcodes include: (a) barcodes have different and identifiable characteristics for different classes of genomes, such as prokaryotes, eukaryotes, mitochondria and plastids; and (b) barcode similarities are generally proportional to the genomes' phylogenetic closeness.
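The observation that most fragments of a genome share a similar barcode, with a minority deviating, suggests a simple outlier screen. The sketch below is an illustrative assumption, not the method used in the paper: it takes a matrix of per-fragment k-mer frequencies (one row per fragment), measures each row's Euclidean distance to the genome-wide mean row, and flags rows whose distance exceeds the mean distance by more than `n_sigma` standard deviations. The function names, the distance measure, and the threshold are all hypothetical choices.

```python
from math import sqrt
from statistics import mean, pstdev


def mean_row(rows):
    """Column-wise mean of a list of equal-length frequency rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def distances_to_mean(rows):
    """Euclidean distance of each fragment's row to the genome-wide mean row."""
    centre = mean_row(rows)
    return [sqrt(sum((x - c) ** 2 for x, c in zip(row, centre))) for row in rows]


def atypical_fragments(rows, n_sigma=2.0):
    """Indices of fragments whose barcode deviates strongly from the majority.

    The cutoff (mean distance + n_sigma * standard deviation) is an
    illustrative assumption, not a threshold taken from the paper.
    """
    dists = distances_to_mean(rows)
    cutoff = mean(dists) + n_sigma * pstdev(dists)
    return [i for i, d in enumerate(dists) if d > cutoff]
```

Fragments flagged this way are only candidates. According to the abstract, a deviating barcode may indicate either a horizontally transferred gene or a highly expressed one, so further evidence is needed to tell the two apart.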
Conclusion
These and other properties of genome barcodes make them a new and effective tool for studying numerous genome and metagenome analysis problems.