
Can the nucleotide content of a DNA sequence predict the sequence accessibility?

Sequence accessibility is an important factor affecting gene expression. The accessibility, or openness, of a sequence affects the likelihood that a gene is transcribed, translated into a protein, and ultimately able to perform its function and manifest a trait. DNA, which carries the genes, is packaged as chromatin, of which there are two types: heterochromatin and euchromatin. Heterochromatin tends to be inaccessible, so its genes are often not expressed. In contrast, euchromatin is more accessible, and its genes are more readily expressed. The accessibility of a gene therefore depends on the type of chromatin in which it resides, and greater accessibility brings a greater likelihood of transcription and expression. Many factors may affect the accessibility of a gene. In this study, we hypothesized that the nucleotide content of a genetic sequence predicts its accessibility. Using a linear regression machine learning model, we studied the relationship between nucleotide content and accessibility. DNA sequences are made up of four nucleotides: adenine, thymine, guanine, and cytosine. We compared the content of each of these nucleotides, either singly or in specific combinations of two nucleotides, with sequence accessibility in the K562 cell line. Of all the combinations tried, the cytosine-guanine combination content had the highest positive correlation with accessibility and, by extension, with gene expression. This correlation allows us to better predict which genetic sequences will be more frequently expressed based solely on their nucleotide content and sequence. Predicting gene expression through machine learning algorithms promises to catalyze our ability to understand the structure and function of specific gene sequences.
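The kind of analysis described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes that "content" means the fraction of positions occupied by a given base or by either base of a pair, and that each sequence comes with one numeric accessibility score. The function names (`base_fraction`, `fit_line`, `rank_features`) are hypothetical, and the sequences and scores in `TOY_DATA` are invented for illustration only and are not K562 measurements.

```python
from itertools import combinations


def base_fraction(seq, bases):
    """Fraction of positions in `seq` whose base is one of `bases`."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for b in seq if b in bases) / len(seq)


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept, plus Pearson r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


def rank_features(data):
    """Fit one line per single base and per pair of bases.

    Returns a dict mapping a feature name such as "A" or "CG" to
    (slope, intercept, r). Features with no variation are skipped.
    """
    seqs = [s for s, _ in data]
    ys = [y for _, y in data]
    features = [frozenset(b) for b in "ACGT"]
    features += [frozenset(pair) for pair in combinations("ACGT", 2)]
    results = {}
    for feat in features:
        xs = [base_fraction(s, feat) for s in seqs]
        if len(set(xs)) < 2:
            continue  # a constant feature cannot be regressed on
        results["".join(sorted(feat))] = fit_line(xs, ys)
    return results


# Invented toy data: (sequence, accessibility score). Illustration only.
TOY_DATA = [
    ("ATATATTAAT", 0.10),
    ("ATGCATTAGA", 0.35),
    ("GCGCATGCAT", 0.60),
    ("GCGCGCCGGC", 0.90),
]

results = rank_features(TOY_DATA)
for name, (slope, intercept, r) in sorted(results.items(), key=lambda kv: -kv[1][2]):
    print(f"{name}: slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

Each feature is fitted on its own so that the correlations can be compared directly, which mirrors the comparison of single nucleotides and two-nucleotide combinations in the abstract. A real analysis would use measured accessibility values and many more sequences.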

Related Results

Genome wide hypomethylation and youth-associated DNA gap reduction promoting DNA damage and senescence-associated pathogenesis
Abstract Background: Age-associated epigenetic alteration is the underlying cause of DNA damage in aging cells. Two types of youth-associated DNA-protection epigenetic mark...
Genome wide hypomethylation and youth-associated DNA gap reduction promoting DNA damage and senescence-associated pathogenesis
Introduction: The United States currently faces two opioid crises, an evolved crisis currently manifesting as widespread abuse of illicit opioids, and a crisis in pain management l...
Echinococcus granulosus in Environmental Samples: A Cross-Sectional Molecular Study
Abstract Introduction Echinococcosis, caused by tapeworms of the Echinococcus genus, remains a significant zoonotic disease globally. The disease is particularly prevalent in areas...
Spatial control of protein binding with DNA nanostructures
The physical and chemical properties of DNA, including its structure predictability thanks to Watson-Crick base pairing, make it into an obvious polymer of choic...
Abstract 4679: A novel assay to predict susceptibility to tobacco-induced disease.
Abstract Background: Tobacco misuse is the leading preventable cause of morbidity and mortality in the world. Tobacco-induced DNA damage is one of the main mechanism...
The roles of HMGB1‐produced DNA gaps in DNA protection and aging biomarker reversal
Abstract The endogenous DNA damage triggering an aging progression in the elderly is prevented in the youth, probably by naturally occurring DNA gaps. Decreased DNA gaps are found d...
The Conjugative Relaxase TrwC Promotes Integration of Foreign DNA in the Human Genome
ABSTRACT Bacterial conjugation is a mechanism of horizontal DNA transfer. The relaxase TrwC of the conjugative plasmid R388 cleaves one strand of the transfe...
