On the optimal trimming of high-throughput mRNA sequence data
Abstract
The widespread and rapid adoption of high-throughput sequencing technologies has afforded researchers the opportunity to gain a deep understanding of genome level processes that underlie evolutionary change, and perhaps more importantly, the links between genotype and phenotype. In particular, researchers interested in functional biology and adaptation have used these technologies to sequence mRNA transcriptomes of specific tissues, which in turn are often compared to other tissues, or other individuals with different phenotypes. While these techniques are extremely powerful, careful attention to data quality is required. In particular, because high-throughput sequencing is more error-prone than traditional Sanger sequencing, quality trimming of sequence reads should be an important step in all data processing pipelines. While several software packages for quality trimming exist, no general guidelines for the specifics of trimming have been developed. Here, using empirically derived sequence data, I provide general recommendations regarding the optimal strength of trimming, specifically in mRNA-Seq studies. Although very aggressive quality trimming is common, this study suggests that a more gentle trimming, specifically of those nucleotides whose Phred score < 2 or < 5, is optimal for most studies across a wide variety of metrics.
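To make the thresholds concrete: a Phred score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so Q = 2 is roughly a 63% chance of error and Q = 5 roughly 32%. The following is a minimal Python sketch of gentle trimming, not the procedure used in the study. It assumes Phred+33 encoded quality strings and simply strips bases below the threshold from both ends of a read. The function names are hypothetical, and dedicated trimming tools often use more elaborate schemes, such as sliding windows.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode an ASCII quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q: int) -> float:
    """Estimated probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def trim_ends(seq: str, quality: str, threshold: int = 5, offset: int = 33):
    """Strip bases with Phred score < threshold from both ends of a read.

    Hypothetical helper for illustration: interior low-quality bases are
    left untouched, and an empty read is returned if every base fails.
    """
    if len(seq) != len(quality):
        raise ValueError("sequence and quality string must have equal length")
    scores = phred_scores(quality, offset)
    start, end = 0, len(scores)
    # Advance past low-quality bases at the 5' end.
    while start < end and scores[start] < threshold:
        start += 1
    # Back up past low-quality bases at the 3' end.
    while end > start and scores[end - 1] < threshold:
        end -= 1
    return seq[start:end], quality[start:end]


if __name__ == "__main__":
    read = "ACGTACGT"
    qual = "!IIIII%#"  # Phred scores: 0, 40, 40, 40, 40, 40, 4, 2
    print(trim_ends(read, qual, threshold=5))
    print(trim_ends(read, qual, threshold=2))
```

With a threshold of 5, the example read loses its first base and its last two; with a threshold of 2, only the first base (Phred 0) is removed, because a score of exactly 2 is not below the cutoff.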
Related Results
Tissue renin angiotensin system in IgA nephropathy
The inhibition of angiotensin II (AngII) by use of angiotensin converting enzyme (ACE) inhibitor or AngII receptor blocker is effective for prevention of the progression of renal d...
Impairment of HuR-Mediated FOS mRNA Stabilization in Granulocytes From Myelodysplastic Syndrome Patients.
Abstract
Abstract 2805
Infection is a major cause of death in patients with myelodysplastic syndromes (MDS). Although qualitative and quantitative gra...
Study of Deformation and Fracture of High Strength Steel Sheet during Conventional and Robust Trimming by Conducting Partial Trimming Tests
Abstract
High-strength steels are used in the automotive industry for weight reduction and improved vehicle crashworthiness. In this work, an instrumented trimming die equi...
Mechanism of Tripeptide Trimming by γ-Secretase
Abstract
The membrane-embedded γ-secretase complex processively cleaves within the transmembrane domain of amyloid precursor protein (APP) to pro...
Trimming Operations
Abstract
Trimming is the removal of excess metal from a stamped part to allow the part to reach the finished stage or to prepare it for subsequent operations. This a...
Recent Advances in mRNA Vaccine Development
Traditional vaccines are produced by using weakened or inactivated forms of disease-causing pathogens to produce the target antigen they are designed to protect against. Messenger ...
SeqPurge: highly-sensitive adapter trimming for paired-end short read data
Trimming adapter sequences from short read data is a common preprocessing step in most DNA/RNA sequence analysis pipelines. For amplicon-based approaches, which are mostly used in ...
Statistical inference reveals the role of length, GC content, and local sequence in V(D)J nucleotide trimming
To appropriately defend against a wide array of pathogens, humans somatically generate highly diverse repertoires of B cell and T cell receptors (BCRs and TCRs) through a random pr...

