Javascript must be enabled to continue!
Genome Rearrangement Distance with Reversals, Transpositions, and Indels
View through CrossRef
The rearrangement distance is a well-known problem in the field of comparative genomics. Given two genomes, the rearrangement distance is the minimum number of rearrangements in a set of allowed rearrangements (rearrangement model), which transforms one genome into the other. In rearrangement distance problems, a genome is modeled as a string, where each element represents a conserved region within the two genomes. When the orientation of the genes is known, it is represented by (plus or minus) signs assigned to the elements of the string. Two of the most studied rearrangements are reversals, which invert a segment of the genome, and transpositions, which exchange the relative positions of two adjacent segments of the genome. The first works in genome rearrangements considered that the genomes being compared had the same genetic material and that rearrangement events were restricted to reversals, transpositions, or both. El-Mabrouk extended the reversal model on signed strings to include the operations of insertion and deletion of segments in the genome, which allowed the comparison of genomes with different genetic material. Other studies also addressed this problem and, recently, this problem was proved to be solvable in polynomial time by Willing et al. For unsigned strings, we still observe a lack of results. That said, in this study we prove that computing the rearrangement distance for the following models is NP-Hard: reversals and indels on unsigned strings; transpositions and indels on unsigned strings; and reversals, transpositions, and indels on signed and unsigned strings. Along with the NP-hardness proofs, we present a 2-approximation algorithm for reversals on unsigned strings and 3-approximation algorithms for the other models.
Title: Genome Rearrangement Distance with Reversals, Transpositions, and Indels
Description:
The rearrangement distance is a well-known problem in the field of comparative genomics.
Given two genomes, the rearrangement distance is the minimum number of rearrangements in a set of allowed rearrangements (rearrangement model), which transforms one genome into the other.
In rearrangement distance problems, a genome is modeled as a string, where each element represents a conserved region within the two genomes.
When the orientation of the genes is known, it is represented by (plus or minus) signs assigned to the elements of the string.
Two of the most studied rearrangements are reversals, which invert a segment of the genome, and transpositions, which exchange the relative positions of two adjacent segments of the genome.
The first works in genome rearrangements considered that the genomes being compared had the same genetic material and that rearrangement events were restricted to reversals, transpositions, or both.
El-Mabrouk extended the reversal model on signed strings to include the operations of insertion and deletion of segments in the genome, which allowed the comparison of genomes with different genetic material.
Other studies also addressed this problem and, recently, this problem was proved to be solvable in polynomial time by Willing et al.
For unsigned strings, we still observe a lack of results.
That said, in this study we prove that computing the rearrangement distance for the following models is NP-Hard: reversals and indels on unsigned strings; transpositions and indels on unsigned strings; and reversals, transpositions, and indels on signed and unsigned strings.
Along with the NP-hardness proofs, we present a 2-approximation algorithm for reversals on unsigned strings and 3-approximation algorithms for the other models.
Related Results
Incorporating intergenic regions into reversal and transposition distances with indels
Incorporating intergenic regions into reversal and transposition distances with indels
Problems in the genome rearrangement field are often formulated in terms of pairwise genome comparison: given two genomes [Formula: see text] and [Formula: see text], find the mini...
On the Complexity of Some Variations of Sorting by Transpositions
On the Complexity of Some Variations of Sorting by Transpositions
One of the main challenges in Computational Biology is to find the evolutionary distance between two organisms. In the field of comparative genomics, one way to estimate such dista...
Phylogenetic reconstruction from transpositions
Phylogenetic reconstruction from transpositions
Abstract
Background
Because of the advent of high-throughput sequencing and the consequent reduction in the cost of sequencing, many organisms ha...
BIOM-12. ULTRA-DEEP WHOLE GENOME SEQUENCING UNCOVERS A HIGH PREVALENCE OF GLIOBLASTOMA-DERIVED CELL-FREE DNA IN PLASMA
BIOM-12. ULTRA-DEEP WHOLE GENOME SEQUENCING UNCOVERS A HIGH PREVALENCE OF GLIOBLASTOMA-DERIVED CELL-FREE DNA IN PLASMA
Abstract
The detection of glioblastoma-derived circulating cell-free DNA (ccfDNA) has proven formidable because tumor-specific mutations are hidden among the abundan...
Detecting Large Indels Using Optical Map Data
Detecting Large Indels Using Optical Map Data
Abstract
Optical Maps (OM) provide reads that are very long, and thus can be used to detect large indels not detectable by the shorter reads prov...
Extended Duration of DH–JH Rearrangement in Immunoglobulin Heavy Chain Transgenic Mice: Implications for Regulation of Allelic Exclusion
Extended Duration of DH–JH Rearrangement in Immunoglobulin Heavy Chain Transgenic Mice: Implications for Regulation of Allelic Exclusion
Here we show that suppression of VH–DJH rearrangement in mice bearing a μ heavy (H) chain transgene (μ-tg mice) is associated with an extended period of DH–JH rearrangement, the fi...
Whole Genome Resequencing and 1000 Genomes Project
Whole Genome Resequencing and 1000 Genomes Project
Abstract
The recent advances in sequencing technologies have enabled the whole human genome to be sequenced within weeks. To date, several human...
Gamma-rays induced genome wide stable mutations in cowpea deciphered through whole genome sequencing
Gamma-rays induced genome wide stable mutations in cowpea deciphered through whole genome sequencing
Abstract
Gamma-rays are the most widely exploited physical mutagen in plant mutation breeding. They are known to be involved in development of more than 60% of global cowpe...

