
iEnhancer-CLA: Self-attention-based interpretable model for enhancers and their strength prediction

Abstract
Enhancers are a class of non-coding, cis-acting DNA elements that play a crucial role in transcription during the development of eukaryotes. Computational methods for predicting enhancers have been developed and achieve satisfactory performance. However, existing methods rely on experience-based feature engineering and lack interpretability, which limits the representational capacity of the models and makes their predictions difficult to interpret. In this paper, we propose a novel deep-learning-based model, iEnhancer-CLA, for identifying enhancers and their strengths. Specifically, iEnhancer-CLA automatically learns 1D sequence features through multiscale convolutional neural networks (CNNs) and employs a self-attention mechanism to represent global features formed by multiple elements (multibody effects). In particular, the model can provide an interpretable analysis of enhancer motifs and key base signals by decoupling the CNN modules and generating self-attention weights. To avoid the bias of setting hyperparameters manually, we use Bayesian optimization to search for globally optimal model hyperparameters. The results demonstrate that our method outperforms existing predictors in accuracy for identifying enhancers and their strengths. Importantly, our analyses found that the distribution of bases in enhancers is uneven, with the base G enriched, while the distribution of bases in non-enhancers is relatively even. This finding contributes to improved prediction performance and facilitates a deeper understanding of the potential functional mechanisms of enhancers.

Author summary
Enhancers comprise many subspecies, and the accuracy of existing models is difficult to improve because the available data set is small. Motivated by the need for accurate and efficient methods to predict enhancer types, we developed iEnhancer-CLA, a self-attention deep learning model that aims to distinguish quickly and effectively both between enhancers and non-enhancers and between subspecies of enhancers. The model learns sequence features effectively by combining multi-scale CNN blocks, BLSTM layers, and self-attention mechanisms, which improves its accuracy. Encouragingly, decoupling the CNN layer showed that it learns sequence motifs effectively, and in combination with the self-attention weights this makes the model interpretable. We further performed sequence analysis in conjunction with the model-generated weights and discovered differences between the sequence characteristics of enhancers and non-enhancers. This finding can guide the construction of subsequent models for identifying enhancer sequences.
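
The reported difference in base composition between enhancers and non-enhancers can be illustrated with a short calculation. The following is only a minimal sketch and is not the authors' analysis code: the function names `base_frequencies` and `unevenness` are hypothetical, the toy sequences are invented for illustration, and the "largest deviation from 0.25" score is an assumed, simple way to quantify how uneven a base distribution is.

```python
from collections import Counter

BASES = "ACGT"

def base_frequencies(sequences):
    """Return the fraction of each base (A, C, G, T) pooled over all sequences."""
    counts = Counter()
    for seq in sequences:
        # Ignore ambiguous symbols such as N so the fractions sum to 1.
        counts.update(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return {b: counts[b] / total for b in BASES}

def unevenness(freqs):
    """Largest deviation from the uniform share of 0.25 (assumed metric)."""
    return max(abs(f - 0.25) for f in freqs.values())

# Toy sequences invented for illustration; they are NOT data from the paper.
toy_enhancers = ["GGGCGGAGTGGG", "TGGGCGGGAGGC"]
toy_non_enhancers = ["ACGTACGTACGT", "TGCATGCATGCA"]

enh = base_frequencies(toy_enhancers)
non = base_frequencies(toy_non_enhancers)

print("enhancers:    ", enh, "unevenness =", unevenness(enh))
print("non-enhancers:", non, "unevenness =", unevenness(non))
```

On real data, the same two functions could be applied to the enhancer and non-enhancer sets of a benchmark to check whether G is over-represented; a per-position count would be the natural extension if positional signals are of interest.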

Related Results

Lipid nanocarriers : a novel approach to delivering ophthalmic clarithromycin
The feasibility of incorporating clarithromycin (CLA) into innovative solid lipid nanoparticles (SLN) and nanostructured lipid carriers (NLC) using hot emulsifi...
Modulation of prostaglandin H synthase activity by conjugated linoleic acid (CLA) and specific CLA isomers
Abstract Conjugated linoleic acid (CLA) has been shown to inhibit tumorigenesis in animal models and is cytostatic to numerous cell lines in vitro. However, the mechanism of action ...
Conjugated linoleic acid modulates hepatic lipid composition in mice
Abstract Conjugated linoleic acid (CLA) is a chemoprotective fatty acid that inhibits mammary, colon, forestomach, and skin carcinogenesis in experimental animals. We hypothesize th...
Automated identification of cell-type–specific genes and alternative promoters
Abstract Background Identifying key transcriptional features, such as genes or transcripts, involved in cellular differentiatio...
Comprehensive review on prevalence of caseous lymphadenitis (CLA) in dairy goats: A systematic review and meta-analysis
Caseous lymphadenitis (CLA) presents a significant challenge to the dairy goat industry worldwide, negatively affecting animal health, productivity, and economic sustainability. It...
Epigenetic Remodeling in Human Coronary Artery Smooth Muscle Cell Phenotypic Switching
Abstract Background Smooth muscle cell (SMC) dedifferentiation contributes to repair and remodeling, but also cardiovascular pathologies. To understand this plasticity, the epigeneti...
Conjugated linoleic acid modulation of phorbol ester‐induced events in murine keratinocytes
Abstract Recent work in our lab has shown that the chemoprotective fatty acid, conjugated linoleic acid (CLA), inhibits phorbol ester skin tumor promotion in mice. Because little is...
Goat Milk Fat Naturally Enriched with Conjugated Linoleic Acid Increased Lipoproteins and Reduced Triacylglycerol in Rats
Goat milk is a source of different lipids, including conjugated linoleic acid (CLA). CLA reduces body fat and protects against cardiovascular diseases. In the present study fat from g...
