Search engine for discovering works of Art, research articles, and books related to Art and Culture
ShareThis
Javascript must be enabled to continue!

libFLASM: a software library for fixed-length approximate string matching

View through CrossRef
Abstract Background Approximate string matching is the problem of finding all factors of a given text that are at a distance at most k from a given pattern. Fixed-length approximate string matching is the problem of finding all factors of a text of length n that are at a distance at most k from any factor of length ℓ of a pattern of length m. There exist bit-vector techniques to solve the fixed-length approximate string matching problem in time $\mathcal {O}(m\lceil \ell /w \rceil n)$ O ( m ⌈ ℓ / w ⌉ n ) and space $\mathcal {O}(m\lceil \ell /w\rceil)$ O ( m ⌈ ℓ / w ⌉ ) under the edit and Hamming distance models, where w is the size of the computer word; as such these techniques are independent of the distance threshold k or the alphabet size. Fixed-length approximate string matching is a generalisation of approximate string matching and, hence, has numerous direct applications in computational molecular biology and elsewhere. Results We present and make available , a free open-source software library for solving fixed-length approximate string matching under both the edit and the Hamming distance models. Moreover we describe how fixed-length approximate string matching is applied to solve real problems by incorporating into established applications for multiple circular sequence alignment as well as single and structured motif extraction. Specifically, we describe how it can be used to improve the accuracy of multiple circular sequence alignment in terms of the inferred likelihood-based phylogenies; and we also describe how it is used to efficiently find motifs in molecular sequences representing regulatory or functional regions. The comparison of the performance of the library to other algorithms show how it is competitive, especially with increasing distance thresholds. Conclusions Fixed-length approximate string matching is a generalisation of the classic approximate string matching problem. We present , a free open-source software library for solving fixed-length approximate string matching. The extensive experimental results presented here suggest that other applications could benefit from using , and thus further maintenance and development of is desirable.
Title: libFLASM: a software library for fixed-length approximate string matching
Description:
Abstract Background Approximate string matching is the problem of finding all factors of a given text that are at a distance at most k from a given pattern.
Fixed-length approximate string matching is the problem of finding all factors of a text of length n that are at a distance at most k from any factor of length ℓ of a pattern of length m.
There exist bit-vector techniques to solve the fixed-length approximate string matching problem in time $\mathcal {O}(m\lceil \ell /w \rceil n)$ O ( m ⌈ ℓ / w ⌉ n ) and space $\mathcal {O}(m\lceil \ell /w\rceil)$ O ( m ⌈ ℓ / w ⌉ ) under the edit and Hamming distance models, where w is the size of the computer word; as such these techniques are independent of the distance threshold k or the alphabet size.
Fixed-length approximate string matching is a generalisation of approximate string matching and, hence, has numerous direct applications in computational molecular biology and elsewhere.
Results We present and make available , a free open-source software library for solving fixed-length approximate string matching under both the edit and the Hamming distance models.
Moreover we describe how fixed-length approximate string matching is applied to solve real problems by incorporating into established applications for multiple circular sequence alignment as well as single and structured motif extraction.
Specifically, we describe how it can be used to improve the accuracy of multiple circular sequence alignment in terms of the inferred likelihood-based phylogenies; and we also describe how it is used to efficiently find motifs in molecular sequences representing regulatory or functional regions.
The comparison of the performance of the library to other algorithms show how it is competitive, especially with increasing distance thresholds.
Conclusions Fixed-length approximate string matching is a generalisation of the classic approximate string matching problem.
We present , a free open-source software library for solving fixed-length approximate string matching.
The extensive experimental results presented here suggest that other applications could benefit from using , and thus further maintenance and development of is desirable.

Related Results

Parameterized Strings: Algorithms and Applications
Parameterized Strings: Algorithms and Applications
The parameterized string (p-string), a generalization of the traditional string, is composed of constant and parameter symbols. A parameterized match (p-match) exists between two p...
Approximate Chinese String Matching Techniques Based on Pinyin Input Method
Approximate Chinese String Matching Techniques Based on Pinyin Input Method
String matching is one of the most typical problems in computer science. Previous studies mainly focused on accurate string matching problem. However, with the rapid development of...
Penerapan Algoritma Approximate String Matching Untuk Pencarian Teks Pada Aplikasi Ensiklopedia Teknologi Komputer
Penerapan Algoritma Approximate String Matching Untuk Pencarian Teks Pada Aplikasi Ensiklopedia Teknologi Komputer
Abstrak−Teknologi komputer adalah suatu yang diciptakan untuk kepentingan dalam pengolahan data sehingga teknologi yang dimaksud adalah perkembangan yang mana suatu sistem terdahul...
A Fast Pattern Matching Algorithm Based on Middle Characters of Pattern String
A Fast Pattern Matching Algorithm Based on Middle Characters of Pattern String
String pattern matching is one of the important string operation. At present, the pattern matching algorithm of strings mainly includes BF algorithm, KMP algorithm, and improved KM...
2021 Census to Census Coverage Survey Matching Results.
2021 Census to Census Coverage Survey Matching Results.
The 2021 England and Wales Census was matched to the Census Coverage Survey (CCS). This was an essential requisite for estimating undercount in the Census. To ensure outputs could ...
Axial Excitation Tool String Modelling
Axial Excitation Tool String Modelling
Current types of axial excitation tool have been shown to produce beneficial results — in terms of load transfer to the bit, general reductions in string friction and reductions in...
18-3/4 in. FullBore Wellhead System
18-3/4 in. FullBore Wellhead System
Abstract This paper describes the development of full-bore wellheads, a new 18-3/4 in.15,000 psi W.P. system, from conception to field installation. The wellheadw...

Back to Top