Javascript must be enabled to continue!

Genome Rearrangement Distance with Reversals, Transpositions, and Indels

The rearrangement distance is a well-known problem in the field of comparative genomics. Given two genomes, the rearrangement distance is the minimum number of rearrangements in a set of allowed rearrangements (rearrangement model), which transforms one genome into the other. In rearrangement distance problems, a genome is modeled as a string, where each element represents a conserved region within the two genomes. When the orientation of the genes is known, it is represented by (plus or minus) signs assigned to the elements of the string. Two of the most studied rearrangements are reversals, which invert a segment of the genome, and transpositions, which exchange the relative positions of two adjacent segments of the genome. The first works in genome rearrangements considered that the genomes being compared had the same genetic material and that rearrangement events were restricted to reversals, transpositions, or both. El-Mabrouk extended the reversal model on signed strings to include the operations of insertion and deletion of segments in the genome, which allowed the comparison of genomes with different genetic material. Other studies also addressed this problem and, recently, this problem was proved to be solvable in polynomial time by Willing et al. For unsigned strings, we still observe a lack of results. That said, in this study we prove that computing the rearrangement distance for the following models is NP-Hard: reversals and indels on unsigned strings; transpositions and indels on unsigned strings; and reversals, transpositions, and indels on signed and unsigned strings. Along with the NP-hardness proofs, we present a 2-approximation algorithm for reversals on unsigned strings and 3-approximation algorithms for the other models.

SAGE Publications

Alexsandro Oliveira Alexandrino Andre Rodrigues Oliveira Ulisses Dias Zanoni Dias

Journal of Computational Biology

2020

Title: Genome Rearrangement Distance with Reversals, Transpositions, and Indels

Description:

The rearrangement distance is a well-known problem in the field of comparative genomics.

Given two genomes, the rearrangement distance is the minimum number of rearrangements in a set of allowed rearrangements (rearrangement model), which transforms one genome into the other.

In rearrangement distance problems, a genome is modeled as a string, where each element represents a conserved region within the two genomes.

When the orientation of the genes is known, it is represented by (plus or minus) signs assigned to the elements of the string.

Two of the most studied rearrangements are reversals, which invert a segment of the genome, and transpositions, which exchange the relative positions of two adjacent segments of the genome.

The first works in genome rearrangements considered that the genomes being compared had the same genetic material and that rearrangement events were restricted to reversals, transpositions, or both.

El-Mabrouk extended the reversal model on signed strings to include the operations of insertion and deletion of segments in the genome, which allowed the comparison of genomes with different genetic material.

Other studies also addressed this problem and, recently, this problem was proved to be solvable in polynomial time by Willing et al.

For unsigned strings, we still observe a lack of results.

That said, in this study we prove that computing the rearrangement distance for the following models is NP-Hard: reversals and indels on unsigned strings; transpositions and indels on unsigned strings; and reversals, transpositions, and indels on signed and unsigned strings.

Along with the NP-hardness proofs, we present a 2-approximation algorithm for reversals on unsigned strings and 3-approximation algorithms for the other models.

Back

Problems in the genome rearrangement field are often formulated in terms of pairwise genome comparison: given two genomes [Formula: see text] and [Formula: see text], find the mini...

On the Complexity of Some Variations of Sorting by Transpositions

One of the main challenges in Computational Biology is to find the evolutionary distance between two organisms. In the field of comparative genomics, one way to estimate such dista...

Phylogenetic reconstruction from transpositions

Abstract Background Because of the advent of high-throughput sequencing and the consequent reduction in the cost of sequencing, many organisms ha...

BIOM-12. ULTRA-DEEP WHOLE GENOME SEQUENCING UNCOVERS A HIGH PREVALENCE OF GLIOBLASTOMA-DERIVED CELL-FREE DNA IN PLASMA

Abstract The detection of glioblastoma-derived circulating cell-free DNA (ccfDNA) has proven formidable because tumor-specific mutations are hidden among the abundan...

Detecting Large Indels Using Optical Map Data

Abstract Optical Maps (OM) provide reads that are very long, and thus can be used to detect large indels not detectable by the shorter reads prov...

Extended Duration of DH–JH Rearrangement in Immunoglobulin Heavy Chain Transgenic Mice: Implications for Regulation of Allelic Exclusion

Here we show that suppression of VH–DJH rearrangement in mice bearing a μ heavy (H) chain transgene (μ-tg mice) is associated with an extended period of DH–JH rearrangement, the fi...

Whole Genome Resequencing and 1000 Genomes Project

Abstract The recent advances in sequencing technologies have enabled the whole human genome to be sequenced within weeks. To date, several human...

Gamma-rays induced genome wide stable mutations in cowpea deciphered through whole genome sequencing

Abstract Gamma-rays are the most widely exploited physical mutagen in plant mutation breeding. They are known to be involved in development of more than 60% of global cowpe...

Email:
Password:

Email:

Genome Rearrangement Distance with Reversals, Transpositions, and Indels

Related Results