CDS-BART: A BART-Based Foundation Model for mRNA Sequence Analysis

Abstract
Summary: Recent advances in artificial intelligence (AI) have led to the development of foundation models that interpret mRNA as a language. Notable examples include CodonBERT, hydraRNA, EVO2, and Helix-mRNA. These models show significant potential as tools for mRNA research. However, to the best of our knowledge, there is currently no publicly available AI model that is both easy to use and capable of analyzing mRNA sequences up to about 4 kb, a length scale typical of many therapeutic mRNAs, including those encapsulated within lipid nanoparticles (LNPs). We therefore propose CDS-BART, a user-friendly, open-source tool that integrates SentencePiece sub-word tokenization with the denoising sequence-to-sequence training of Bidirectional and Auto-Regressive Transformers (BART). CDS-BART was pre-trained on mRNA data from nine taxonomic groups provided by the NCBI RefSeq database. This comprehensive pre-training, coupled with BART’s denoising capability, ensures effective learning of codon usage, mRNA structure, evolution, and regulation. As a result, CDS-BART can deliver robust performance across a wide range of mRNA prediction tasks.
Availability and Implementation: CDS-BART is released under the MIT License. The latest code is available on GitHub at https://github.com/mogam-ai/CDS-BART.
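To make the idea of denoising sequence-to-sequence training more concrete, the following is a minimal sketch of span-masking noise applied to a coding sequence. It is an illustration only and not the CDS-BART implementation: the fixed codon split stands in for SentencePiece sub-word tokenization, span lengths are drawn uniformly rather than from the Poisson distribution used in the original BART recipe, and the names `split_codons`, `infill_noise`, `mask_ratio`, and `mean_span` are hypothetical.

```python
import random

MASK = "<mask>"  # assumed mask symbol, chosen for illustration


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into 3-nt tokens; a trailing partial codon is dropped.

    Stand-in for a learned sub-word tokenizer (assumption, not the paper's tokenizer).
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def infill_noise(tokens, mask_ratio=0.3, mean_span=3, seed=0):
    """Replace random spans of tokens with one mask token each.

    Simplified text infilling: at most mask_ratio of the tokens are hidden,
    and each hidden span collapses to a single MASK symbol.
    """
    rng = random.Random(seed)
    budget = int(len(tokens) * mask_ratio)  # max number of tokens to hide
    out, i = [], 0
    while i < len(tokens):
        if budget > 0 and rng.random() < mask_ratio / mean_span:
            # Uniform span length with mean `mean_span` (simplification).
            span = min(budget, rng.randint(1, 2 * mean_span - 1), len(tokens) - i)
            out.append(MASK)
            i += span
            budget -= span
        else:
            out.append(tokens[i])
            i += 1
    return out


if __name__ == "__main__":
    target = split_codons("ATGGCCAAGTTTGGGCCCTAA" * 5)
    source = infill_noise(target, seed=1)
    # A denoising model would be trained to map `source` back to `target`.
    print(" ".join(source))
    print(" ".join(target))
```

In this setup the corrupted sequence is the encoder input and the original sequence is the decoder target, so the model has to infer both the content and the length of each hidden span.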
