Javascript must be enabled to continue!
GIN-TONIC: Non-hierarchical full-text indexing for graph-genomes
View through CrossRef
Abstract
This paper presents a new data structure, GIN-TONIC, designed to index arbitrary string-labelled directed graphs representing, for instance, pangenomes or transcriptomes. GIN-TONIC provides several capabilities not offered by other graph-indexing methods based on the FM-index. It is non-hierarchical, handling a graph as a single monolithic object; it indexes at nucleotide resolution all possible walks in the graph without the need to explicitly store them; it supports exact substring queries in polynomial time and space for all possible walk roots in the graph, even if there are exponentially many walks corresponding to such roots. Specific ad-hoc optimisations, such as a precomputed cache, allow GIN-TONIC to achieve excellent performance for input graphs of various topologies and sizes. Robust scalability capabilities and a querying performance close to that of a linear FM-Index are demonstrated for two real-world applications, a human pangenome and transcriptome. Source code and associated benchmarks are available on GitHub.
Availability and implementation
GIN-TONIC and all related programs are available at
https://github.com/uensalo/gin
.
Title: GIN-TONIC: Non-hierarchical full-text indexing for graph-genomes
Description:
Abstract
This paper presents a new data structure, GIN-TONIC, designed to index arbitrary string-labelled directed graphs representing, for instance, pangenomes or transcriptomes.
GIN-TONIC provides several capabilities not offered by other graph-indexing methods based on the FM-index.
It is non-hierarchical, handling a graph as a single monolithic object; it indexes at nucleotide resolution all possible walks in the graph without the need to explicitly store them; it supports exact substring queries in polynomial time and space for all possible walk roots in the graph, even if there are exponentially many walks corresponding to such roots.
Specific ad-hoc optimisations, such as a precomputed cache, allow GIN-TONIC to achieve excellent performance for input graphs of various topologies and sizes.
Robust scalability capabilities and a querying performance close to that of a linear FM-Index are demonstrated for two real-world applications, a human pangenome and transcriptome.
Source code and associated benchmarks are available on GitHub.
Availability and implementation
GIN-TONIC and all related programs are available at
https://github.
com/uensalo/gin
.
Related Results
Sleep Habits and Occurrence of Lowback Pain among Craftsmen
Sleep Habits and Occurrence of Lowback Pain among Craftsmen
<span style="color: #000000; font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 10px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; ...
Sleep Habits and Occurrence of Lowback Pain among Craftsmen
Sleep Habits and Occurrence of Lowback Pain among Craftsmen
<span style="color: #000000; font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 10px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; ...
A Review on Indexing Techniques and its application in Multilingual Information Retrieval System
A Review on Indexing Techniques and its application in Multilingual Information Retrieval System
To implement the indexing in multilingual dataset, the indexing process must know. This paper gives the brief about indexing and presents role of indexing, logical view of indexing...
Bounds on the sum of broadcast domination number and strong metric dimension of graphs
Bounds on the sum of broadcast domination number and strong metric dimension of graphs
Let [Formula: see text] be a connected graph of order at least two with vertex set [Formula: see text]. For [Formula: see text], let [Formula: see text] denote the length of an [Fo...
OBITUARY: Agustin Wydia Gunawan
OBITUARY: Agustin Wydia Gunawan
OBITUARY
Agustin Wydia Gunawan ( 21 Agustus 1948 - 15 Maret 2025).
Kolega, sahabat, teman seperjuangan, guru kita, Ir. Agustin Wydia Gunawan MS, biasanya lebih dikenal dengan nama ...
Graph convolutional neural networks for 3D data analysis
Graph convolutional neural networks for 3D data analysis
(English) Deep Learning allows the extraction of complex features directly from raw input data, eliminating the need for hand-crafted features from the classical Machine Learning p...
A saturation problem in meshes
A saturation problem in meshes
Let [Formula: see text] and [Formula: see text] be graphs, where we view [Formula: see text] as the “host” graph and [Formula: see text] as a “forbidden” graph. A spanning subgraph...
The Aα-eigenvalues of the generalized subdivision graph
The Aα-eigenvalues of the generalized subdivision graph
Let [Formula: see text] be a graph with an adjacency matrix [Formula: see text] and a diagonal degree matrix [Formula: see text]. For any graph [Formula: see text] and a real numbe...

