
Estimation of redundancy in microbial genomes

Abstract

Background: Microbial genomes vary considerably with respect to both size and base composition. While the smallest genomes have fewer than 200,000 base pairs, or nucleotides, others can consist of millions. The same is true for genomic base composition, often summarized as genomic AT or GC content due to the similar frequencies of (A)denine and (T)hymine on one hand and (C)ytosine and (G)uanine on the other; the most extreme microbes can have genomes with AT content below 25% or above 85%. Genomic AT content influences the frequency of DNA words, or oligonucleotides, consisting of multiple nucleotides. Here we explore to what extent genome size, AT/GC content and genomic oligonucleotide usage variance (OUV) are linked to microbial genome redundancy, or compression rate, as measured using both a DNA-based (MBGC) and a general-purpose (ZPAQ) compression algorithm on 4,713 RefSeq genomes.

Results: We find that genome size (p < 0.001) and OUV (p < 0.001) are both strongly associated with genome redundancy for both types of file compressors. The DNA-based MBGC compressor improved compression by approximately 3% on average relative to ZPAQ. Moreover, MBGC detected a significant (p < 0.001) compression ratio difference between AT-poor and AT-rich genomes that was not detected with ZPAQ.

Conclusion: As lack of compressibility is equivalent to the presence of randomness, our findings suggest that small and AT-rich genomes may have accumulated more random mutations on average than larger and AT-poor/GC-rich genomes, which, in turn, were significantly more redundant. Moreover, we find that OUV is a strong proxy for genome compressibility in microbial genomes. The ZPAQ compressor was found to agree with the MBGC compressor, albeit with poorer performance, except for the compressibility of AT-rich and AT-poor genomes.
Springer Science and Business Media LLC
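
The idea of measuring redundancy as a compression ratio, alongside genomic AT content, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: it uses the standard-library `zlib` compressor as a stand-in for a general-purpose compressor (the study used ZPAQ and MBGC), and the function names `at_content` and `compression_ratio` are hypothetical helpers introduced here for illustration only.

```python
import random
import zlib


def at_content(seq: str) -> float:
    """Fraction of A and T among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("A") + seq.count("T")) / total


def compression_ratio(seq: str) -> float:
    """Compressed size divided by raw size; lower means more redundant.

    Assumption: zlib (DEFLATE) stands in for the compressors used in the
    study, so absolute values are not comparable to the published ones.
    """
    raw = seq.upper().encode("ascii")
    if not raw:
        raise ValueError("empty sequence")
    return len(zlib.compress(raw, 9)) / len(raw)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy example is repeatable
    random_seq = "".join(rng.choice("ACGT") for _ in range(10_000))
    repetitive_seq = "ACGT" * 2_500  # same length, highly redundant

    for name, seq in [("random", random_seq), ("repetitive", repetitive_seq)]:
        print(name, round(at_content(seq), 3), round(compression_ratio(seq), 3))
```

On real data, one would read each genome from a FASTA file and compare the ratios across genomes; the toy sequences above only show that a repetitive sequence compresses far better than a random one of the same length.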

Related Results

Using set theory to reduce redundancy in pathway sets
Abstract Background The consolidation of pathway databases, such as KEGG [1], Reactome [2] and ConsensusPathDB [3], has generated widespread biological interest, however the issue ...
Soil microbial relative resource limitation exhibited contrasting seasonal patterns along an elevational gradient in Yulong Snow Mountain
Abstract Microbial relative resource limitations represented by enzyme stoichiometry reflect the relationship between microbial nutrient requirements and nutrient status in soil,...
Effects of Neonicotinoid Seed Treatments on Soil Microbial Gene Expression Vary with Time in an Agricultural Ecosystem
Abstract Neonicotinoids, a class of systemic insecticides, have been widely used for decades against various insect pests. Past studies have reported non-target effects of neonicoti...
A QUAD CMOS GATES CHECKING METHOD
So-called Fault-Tolerant Systems (FTS) use structural, temporal, functional, or information redundancy to achieve high reliability. For example, Radiation H...
Hyper redundancy for super reliable FPGAs
The subject of the research presented in the article is hyper-redundant elements and FPGA devices that can be used in highly reliable digital systems (HRDS). The current work devel...
A systematic comparison of eight new plastome sequences from Ipomoea L
Background Ipomoea is the largest genus in the family Convolvulaceae. The species in this genus have been widely used in many fields, such as agriculture, nutrition, and medicine. ...
STRUKTUR KOMUNITAS MIKROBA TANAH DAN IMPLIKASINYA DALAM MEWUJUDKAN SISTEM PERTANIAN BERKELANJUTAN (Soil Microbial Community Structure and Its Implications for Realizing a Sustainable Agricultural System)
Soils are made up of organic and inorganic material. The organic soil component contains all the living creatures in the soil and the dead ones in various stages of decomposition....
A Compact Versatile Microbial Fuel Cell From Paper
Microbial fuel cells (MFCs) have been a potential green energy source for a long time but one of the problems is that either the technology must be used on a large scale or special...
