Clinical Feature-Related Single-Base Substitution Sequence Signatures Identified with an Unsupervised Machine Learning Approach
Abstract
Background: Mutation processes leave different signatures in genes. For single-base substitutions (SBSs), previous studies have suggested that mutation signatures are reflected not only in the mutated bases but also in neighboring bases. However, because no method exists to identify features of long sequences next to mutated bases, understanding of how flanking sequences influence mutation signatures is limited.
Methods: We constructed a long short-term memory – self-organizing map (LSTM-SOM) unsupervised neural network. By extracting mutated sequence features via the LSTM and clustering similar features with the SOM, single-base substitutions in The Cancer Genome Atlas database were clustered according to both their mutation site and flanking sequences. The relationship between mutation sequence signatures and clinical features was then analyzed. Finally, we clustered patients into different classes according to the composition of their mutation sequence signatures by the K-means method and then studied the differences in clinical features and survival between classes.
Results: Ten classes of mutant sequence signatures (mutation blots, MBs) were obtained from 2,141,527 single-base substitutions via the LSTM-SOM machine learning approach. Different features in mutated bases and flanking sequences were revealed among MBs. MBs reflect both the site and pathological features of cancers. MBs were related to clinical features, including age, gender, and cancer stage. The class of an MB in a given gene was associated with survival. Finally, patients were clustered into 7 classes according to their MB composition. Significant differences in survival and clinical features were observed among the patient classes.
Conclusions: We provide a method for analyzing the characteristics of mutant sequences. The results of this study show that flanking sequences, together with mutated bases, shape the signatures of SBSs. MBs were shown to be related to the clinical features and survival of cancer patients. The composition of MBs is a feasible predictive factor of clinical prognosis. Further study of the mechanism by which MBs relate to cancer characteristics is suggested.
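The final clustering step described in the Methods can be illustrated with a small sketch. It is not the authors' implementation: it assumes each substitution has already been assigned one of the ten MB classes (the LSTM-SOM stage is not reproduced here), it assumes "composition" means the per-patient fraction of substitutions in each MB class, and the function names `mb_composition` and `kmeans` are hypothetical. K-means is written as plain Lloyd's algorithm in NumPy.

```python
import numpy as np

N_MB = 10       # MB classes reported in the abstract
N_CLASSES = 7   # patient classes reported in the abstract


def mb_composition(patient_ids, mb_labels, n_mb=N_MB):
    """Return (unique patients, per-patient MB fractions).

    Assumption: composition = share of a patient's substitutions
    falling in each MB class (rows sum to 1).
    """
    patient_ids = np.asarray(patient_ids)
    mb_labels = np.asarray(mb_labels, dtype=int)
    patients, rows = np.unique(patient_ids, return_inverse=True)
    counts = np.zeros((len(patients), n_mb))
    np.add.at(counts, (rows, mb_labels), 1.0)  # one count per substitution
    return patients, counts / counts.sum(axis=1, keepdims=True)


def kmeans(x, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = x[rng.choice(len(x), size=k, replace=False)]

    def assign(c):
        dists = np.linalg.norm(x[:, None, :] - c[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        # Keep the old centroid if a cluster becomes empty.
        updated = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(centroids), centroids
```

Given arrays of patient identifiers and MB labels with one entry per substitution, `mb_composition` yields one row per patient, and `kmeans(composition, N_CLASSES)` assigns each patient to one of seven classes. How the authors normalized the composition or initialized K-means is not stated in the abstract.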
Springer Science and Business Media LLC
Related Results
Selection of Injectable Drug Product Composition using Machine Learning Models (Preprint)
BACKGROUND
As of July 2020, a Web of Science search of “machine learning (ML)” nested within the search of “pharmacokinetics or pharmacodynamics” yielded over 100...
Manual and Machine Learning Approaches for Classifying Real and Forged Signatures—A Comparative Study and Forensic Implications
ABSTRACT
A handwritten signature is one of the forms of a biometric measure that creates an individual identity of the persons to mark their approval related to any document. The ma...
A novel unsupervised deep learning network for intelligent fault diagnosis of rotating machinery
Generally, the health conditions of rotating machinery are complicated and changeable. Meanwhile, its fault labeled information is mostly unknown. Therefore, it is man-sized to aut...
Substitution mutational signatures across pan-squamous cell carcinomas
Abstract
Background
Squamous cell carcinoma (SCC) is a highly heterogeneous and aggressive cancer type with significant g...
CREATING LEARNING MEDIA IN TEACHING ENGLISH AT SMP MUHAMMADIYAH 2 PAGELARAN ACADEMIC YEAR 2020/2021
The pandemic Covid-19 currently demands teachers to be able to use technology in teaching and learning process. But in reality there are still many teachers who have not been able ...
Analysis of the Cross-Study Replicability of Tuberculosis Gene Signatures Using 49 Curated Transcriptomic Datasets
Background
Tuberculosis (TB) is the leading cause of infectious disease mortality worldwide. Numerous blood-based gene expression signatures have been proposed in...
A signature-based approach to quantify soil moisture dynamics under contrasting land-uses
Soil moisture signatures provide a promising solution to overcome the difficulty of evaluating soil moisture dynamics in hydrologic models. Soil moisture signatures are metrics tha...
Cancer signatures for reproducible gene expression analysis data: the computational way to achieve precision medicine
Cancer is a complex disease, characterized by extensive genomic aberrations with an evident impact on gene expression regulation and cell biological processes. Many studies and som...

