Permutation compression with applications to genomic data
[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] High-throughput sequencing technology, widely used in molecular biology, generates data at an ever-increasing rate. Such technologies now produce sequences at a lower cost than the cost of storing them. In applications such as gene rearrangement, genomic comparison, and extraction of phylogenetic information, the order of genes matters, because comparing chromosomes gives insight into similarities and differences in gene arrangement. Since chromosomes are represented as permutations of genes, the permutations for species with large genomes, such as humans, must be stored efficiently. This requires an efficient compression technique that stores permutations in a compact form. Compact storage reduces cost and also saves the time and effort needed to retrieve the position of a gene. In computer science, the traditional techniques widely used for compressing integers are built around the frequency of symbols in the dataset. In a permutation, however, each integer appears exactly once, so these frequency-based algorithms cannot be used to compress such datasets. This opens up an area for researchers to develop novel compression techniques that exploit the characteristics of data represented as permutations. This research presents a novel compression algorithm that utilizes the unique features of genomic data.
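The point about symbol frequencies can be made concrete with a small Python sketch. It is not the algorithm this research presents. It only compares what a zero-order frequency model would spend on a permutation with the counting bound log2(n!), the number of bits needed to tell all n! orderings apart. The gene order and all function names are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def zero_order_entropy_bits(symbols):
    """Shannon entropy, in bits per symbol, of the empirical symbol frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def permutation_lower_bound_bits(n):
    """log2(n!): bits needed to distinguish all n! permutations of n items."""
    # lgamma(n + 1) is ln(n!), which avoids building a huge factorial for large n.
    return math.lgamma(n + 1) / math.log(2)


# Hypothetical toy "chromosome": gene identifiers 1..8 in some order (an assumption).
gene_order = [3, 1, 4, 8, 5, 2, 7, 6]
n = len(gene_order)

# Every identifier occurs exactly once, so the frequency distribution is uniform
# and the per-symbol entropy is log2(n): the frequency model finds no redundancy.
per_symbol_bits = zero_order_entropy_bits(gene_order)
frequency_model_bits = per_symbol_bits * n

# The counting bound for a permutation is smaller than n * log2(n).
bound_bits = permutation_lower_bound_bits(n)

print(f"frequency model: {frequency_model_bits:.2f} bits")
print(f"log2(n!) bound:  {bound_bits:.2f} bits")
```

In this toy case the frequency model spends the same log2(n) bits per gene as a plain fixed-width encoding, while the counting bound is lower. The gap between the two is the room a permutation-aware scheme could in principle use.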

