Search engine for discovering works of Art, research articles, and books related to Art and Culture
ShareThis
Javascript must be enabled to continue!

Buffering Updates Enables Efficient Dynamic de Bruijn Graphs

View through CrossRef
Abstract Motivation The de Bruijn graph has become a ubiquitous graph model for biological data ever since its initial introduction in the late 1990s. It has been used for a variety of purposes including genome assembly (Zerbino and Birney, 2008; Bankevich et al., 2012; Peng et al., 2012), variant detection (Alipanahi et al., 2020b; Iqbal et al., 2012), and storage of assembled genomes (Chikhi et al., 2016). For this reason, there have been over a dozen methods for building and representing the de Bruijn graph and its variants in a space and time efficient manner. Results With the exception of a few data structures (Muggli et al., 2019; Holley and Melsted, 2020; Crawford et al., 2018), compressed and compact de Bruijn graphs do not allow for the graph to be efficiently updated, meaning that data can be be added or deleted. The most recent compressed dynamic de Bruijn graph (Alipanahi et al., 2020a), relies on dynamic bit vectors which are slow in theory and practice. To address this shortcoming, we present a compressed dynamic de Bruijn graph that removes the necessity of dynamic bit vectors by buffering data that should be added or removed from the graph. We implement our method, which we refer to as BufBOSS, and compare its performance to Bifrost, DynamicBOSS, and FDBG. Our experiments demonstrate that BufBOSS achieves attractive trade-offs compared to other tools in terms of time, memory and disk, and has the best deletion performance by an order of magnitude. Contact alanko.jarno@gmail.com
Title: Buffering Updates Enables Efficient Dynamic de Bruijn Graphs
Description:
Abstract Motivation The de Bruijn graph has become a ubiquitous graph model for biological data ever since its initial introduction in the late 1990s.
It has been used for a variety of purposes including genome assembly (Zerbino and Birney, 2008; Bankevich et al.
, 2012; Peng et al.
, 2012), variant detection (Alipanahi et al.
, 2020b; Iqbal et al.
, 2012), and storage of assembled genomes (Chikhi et al.
, 2016).
For this reason, there have been over a dozen methods for building and representing the de Bruijn graph and its variants in a space and time efficient manner.
Results With the exception of a few data structures (Muggli et al.
, 2019; Holley and Melsted, 2020; Crawford et al.
, 2018), compressed and compact de Bruijn graphs do not allow for the graph to be efficiently updated, meaning that data can be be added or deleted.
The most recent compressed dynamic de Bruijn graph (Alipanahi et al.
, 2020a), relies on dynamic bit vectors which are slow in theory and practice.
To address this shortcoming, we present a compressed dynamic de Bruijn graph that removes the necessity of dynamic bit vectors by buffering data that should be added or removed from the graph.
We implement our method, which we refer to as BufBOSS, and compare its performance to Bifrost, DynamicBOSS, and FDBG.
Our experiments demonstrate that BufBOSS achieves attractive trade-offs compared to other tools in terms of time, memory and disk, and has the best deletion performance by an order of magnitude.
Contact alanko.
jarno@gmail.
com.

Related Results

Multi de Bruijn Sequences and the Cross-Join Method
Multi de Bruijn Sequences and the Cross-Join Method
We show a method to construct binary multi de Bruijn sequences using the cross-join method. We extend the proof given by Alhakim for ordinary de Bruijn sequences to the case of mul...
Phased Multi de Bruijn Sequences
Phased Multi de Bruijn Sequences
We introduce phased multi de Bruijn sequences, a generalization of de Bruijn sequences. A phased string is a string whose positions sequentially rotate through several alphabets; e...
MBG: Minimizer-based Sparse de Bruijn Graph Construction
MBG: Minimizer-based Sparse de Bruijn Graph Construction
Motivation De Bruijn graphs can be constructed from short reads efficiently and have been used for many purposes. Traditionally long read sequencing technologies ...
Building Large Updatable Colored de Bruijn Graphs via Merging
Building Large Updatable Colored de Bruijn Graphs via Merging
MOTIVATION: There exists several massive genomic and metagenomic data collection efforts, including GenomeTrakr and MetaSub, which are routinely updated with new data. To analyze s...
Distribution of Variables in Lambda-Terms with Restrictions on De Bruijn Indices and De Bruijn Levels
Distribution of Variables in Lambda-Terms with Restrictions on De Bruijn Indices and De Bruijn Levels
We consider two special subclasses of lambda-terms that are restricted by a bound on the number of abstractions between a variable and its binding lambda, the so-called De-Bruijn i...
Social buffering as an indirect effect: mixed-effects modeling approaches
Social buffering as an indirect effect: mixed-effects modeling approaches
The potential for an individual's social partners to buffer--or otherwise modify--how individuals respond to their environment has been demonstrated to be important in many context...
Independent Set in Neutrosophic Graphs
Independent Set in Neutrosophic Graphs
New setting is introduced to study neutrosophic independent number and independent neutrosophic-number arising neighborhood of different vertices. Neighbor is a key term to have th...
Failed Independent Number in Neutrosophic Graphs
Failed Independent Number in Neutrosophic Graphs
New setting is introduced to study neutrosophic failed-independent number and failed independent neutrosophic-number arising neighborhood of different vertices. Neighbor is a key t...

Back to Top