Javascript must be enabled to continue!
Sequence Alignment on Directed Graphs
View through CrossRef
Abstract
Genomic variations in a reference collection are naturally represented as genome variation graphs. Such graphs encode common subsequences as vertices and the variations are captured using additional vertices and directed edges. The resulting graphs are directed graphs possibly with cycles. Existing algorithms for aligning sequences on such graphs make use of partial order alignment (POA) techniques that work on directed acyclic graphs (DAG). For this, acyclic extensions of the input graphs are first constructed through expensive loop unrolling steps (DAGification). Also, such graph extensions could have considerable blow up in their size and in the worst case the blow up factor is proportional to the input sequence length. We provide a novel alignment algorithm V-ALIGN that aligns the input sequence directly on the input graph while avoiding such expensive DAGification steps. V-ALIGN is based on a novel dynamic programming formulation that allows gapped alignment directly on the input graph. It supports affine and linear gaps. We also propose refinements to V-ALIGN for better performance in practice. In this, the time to fill the DP table has linear dependence on the sizes of the sequence, the graph and its feedback vertex set. We perform experiments to compare against the POA based alignment. For aligning short sequences, standard approaches restrict the expensive gapped alignment to small filtered subgraphs having high ‘similarity’ to the input sequence. In such cases, the performance of V-ALIGN for gapped alignment on the filtered subgraph depends on the subgraph sizes.
Title: Sequence Alignment on Directed Graphs
Description:
Abstract
Genomic variations in a reference collection are naturally represented as genome variation graphs.
Such graphs encode common subsequences as vertices and the variations are captured using additional vertices and directed edges.
The resulting graphs are directed graphs possibly with cycles.
Existing algorithms for aligning sequences on such graphs make use of partial order alignment (POA) techniques that work on directed acyclic graphs (DAG).
For this, acyclic extensions of the input graphs are first constructed through expensive loop unrolling steps (DAGification).
Also, such graph extensions could have considerable blow up in their size and in the worst case the blow up factor is proportional to the input sequence length.
We provide a novel alignment algorithm V-ALIGN that aligns the input sequence directly on the input graph while avoiding such expensive DAGification steps.
V-ALIGN is based on a novel dynamic programming formulation that allows gapped alignment directly on the input graph.
It supports affine and linear gaps.
We also propose refinements to V-ALIGN for better performance in practice.
In this, the time to fill the DP table has linear dependence on the sizes of the sequence, the graph and its feedback vertex set.
We perform experiments to compare against the POA based alignment.
For aligning short sequences, standard approaches restrict the expensive gapped alignment to small filtered subgraphs having high ‘similarity’ to the input sequence.
In such cases, the performance of V-ALIGN for gapped alignment on the filtered subgraph depends on the subgraph sizes.
Related Results
Independent Set in Neutrosophic Graphs
Independent Set in Neutrosophic Graphs
New setting is introduced to study neutrosophic independent number and independent neutrosophic-number arising neighborhood of different vertices. Neighbor is a key term to have th...
Failed Independent Number in Neutrosophic Graphs
Failed Independent Number in Neutrosophic Graphs
New setting is introduced to study neutrosophic failed-independent number and failed independent neutrosophic-number arising neighborhood of different vertices. Neighbor is a key t...
Computing a Minimum Subset Feedback Vertex Set on Chordal Graphs Parameterized by Leafage
Computing a Minimum Subset Feedback Vertex Set on Chordal Graphs Parameterized by Leafage
Abstract
Chordal graphs are characterized as the intersection graphs of subtrees in a tree and such a representation is known as the tree model. Restricting the characteriz...
Multiple sequence alignment accuracy and evolutionary distance estimation
Multiple sequence alignment accuracy and evolutionary distance estimation
Abstract
Background
Sequence alignment is a common tool in bioinformatics and comparative genomics. It is generally assumed that multiple sequence a...
On the reciprocal distance spectrum of edge corona of graphs
On the reciprocal distance spectrum of edge corona of graphs
The reciprocal distance spectrum (Harary spectrum) of a connected graph [Formula: see text] is the multiset of eigenvalues of its reciprocal distance matrix (Harary matrix) [Formul...
Ancestral sequence alignment under optimal conditions
Ancestral sequence alignment under optimal conditions
Abstract
Background
Multiple genome alignment is an important problem in bioinformatics. An important subproblem used by many multiple alignment app...
Computing the Energy of Certain Graphs based on Vertex Status
Computing the Energy of Certain Graphs based on Vertex Status
Background:
The concept of Hückel molecular orbital theory is used to compute the graph energy numerically and graphically on the base of the status of a vertex.
Objective:
Our a...
Data Analytics on Graphs Part I: Graphs and Spectra on Graphs
Data Analytics on Graphs Part I: Graphs and Spectra on Graphs
The area of Data Analytics on graphs promises a paradigm shift, as we approach information processing of new classes of data which are typically acquired on irregular but structure...

