Reverse-Complement Equivariant Networks for DNA Sequences
Abstract
As DNA sequencing technologies keep improving in scale and cost, there is a growing need to develop machine learning models to analyze DNA sequences, e.g., to decipher regulatory signals from DNA fragments bound by a particular protein of interest. As a double helix made of two complementary strands, a DNA fragment can be sequenced as two equivalent, so-called Reverse Complement (RC) sequences of nucleotides. Taking this inherent symmetry of the data into account in machine learning models can facilitate learning. To this end, several authors have recently proposed particular RC-equivariant convolutional neural networks (CNNs). However, it remains unknown whether other RC-equivariant architectures exist, which could potentially increase the set of basic models adapted to DNA sequences for practitioners. Here, we close this gap by characterizing the set of all linear RC-equivariant layers, and show in particular that new architectures exist beyond the ones already explored. We further discuss RC-equivariant pointwise nonlinearities adapted to different architectures, as well as RC-equivariant embeddings of k-mers as an alternative to one-hot encoding of nucleotides. We show experimentally that the new architectures can outperform existing ones.
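The RC symmetry described in the abstract can be illustrated with a small sketch. It is not the paper's implementation and does not cover its full characterization of equivariant layers; it only shows one simple weight-tying scheme under stated assumptions. The one-hot channel order A, C, G, T is an assumption (chosen so that complementing a base reverses the channel index), and the helper names `reverse_complement`, `one_hot`, `rc_tied_filters`, and `conv1d` are hypothetical.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed channel order: complement <=> reversed channel index
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Read the opposite strand: reverse the sequence and complement each base."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def one_hot(seq):
    """Encode a DNA string as an array of shape (length, 4)."""
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        x[i, ALPHABET.index(base)] = 1.0
    return x


def rc_tied_filters(half):
    """Build a filter bank of shape (2 * n, width, 4) from n free filters.

    Output channel (2n - 1 - o) is filter o flipped along both the position
    and the nucleotide axes, so the bank as a whole is RC-symmetric.
    """
    return np.concatenate([half, half[::-1, ::-1, ::-1]], axis=0)


def conv1d(x, filters):
    """Valid cross-correlation; returns shape (length - width + 1, n_filters)."""
    width = filters.shape[1]
    return np.stack([
        np.tensordot(filters, x[i:i + width], axes=([1, 2], [0, 1]))
        for i in range(len(x) - width + 1)
    ])


seq = "ACGTTGCAAG"
x = one_hot(seq)
x_rc = one_hot(reverse_complement(seq))

# With this channel order, RC acts on the one-hot input by flipping both axes.
assert np.array_equal(x_rc, x[::-1, ::-1])

# Equivariance: feeding the RC input flips the output along position and channel.
rng = np.random.default_rng(0)
filters = rc_tied_filters(rng.normal(size=(3, 4, 4)))
assert np.allclose(conv1d(x_rc, filters), conv1d(x, filters)[::-1, ::-1])
```

In this sketch the action of RC on the output is again a flip of positions and channels, which is only one possible choice of how the symmetry acts on a hidden layer.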

