pySeqRNA: an automated Python package for RNA sequencing data analysis
With the advent of Next-Generation Sequencing (NGS) technologies, large volumes of data are generated every day. Analysis, however, remains a major hurdle to using the technology efficiently, because the data require complex multi-step processing and computational expertise from the user. A large number of algorithms, statistical methods, and software tools have been developed in recent years to perform the individual analysis steps of various NGS applications. The data analysis procedures for some NGS applications are therefore very complex, requiring several software tools for their various processing steps. As a result, there is a strong need for scalable computing environments that link the individual software components into automated workflows to conduct complex genome-wide analyses efficiently and reproducibly. The Python programming language currently lacks adequate general-purpose NGS workflow solutions. For theoretical and analytical scientists who use Python for NGS data processing, a workflow system for federating NGS applications from within Python would therefore have many benefits. To overcome this limitation, we have developed a Python package (pySeqRNA) that can run NGS data analysis from start to finish reproducibly and efficiently. The package provides a uniform workflow interface and supports running Python code and stand-alone tools on a High-Performance Computing Cluster (HPCC) as well as on local computers. It is a flexible pipeline that can handle complex experiments and samples, whether or not a reference genome is available. It is an extensible environment written in Python for performing end-to-end analysis with automated report generation for various NGS applications such as RNA-Seq, VAR-Seq, ChIP-Seq, single-cell RNA-Seq, and dual RNA-Seq. To simplify the analysis of these applications, the package provides pre-configured analysis and report templates. More analysis templates will be added in the future.
The pySeqRNA workflow consists of quality checking and pre-processing of raw sequence reads; accurate mapping of millions of short sequencing reads to a reference genome, including the identification of splicing events; quantification of expression levels of genes, transcripts, and exons in three ways: (i) uniquely mapped reads, (ii) multi-mapped reads to the same gene, and (iii) multi-mapped groups; differential analysis of gene expression among biological conditions; and biological interpretation of differentially expressed genes, including functional enrichment analysis. The package accelerates the retrieval of reproducible results from NGS experiments. By integrating several command-line tools with custom Python scripts, it allows existing software to be used effectively alongside newly written Python scripts, without restricting users to a collection of pre-defined methods and environments. pySeqRNA is freely available at http://bioinfo.usu.edu/pySeqRNA/.
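The three quantification modes named in the workflow can be illustrated with a small sketch. This is not pySeqRNA's API or its actual counting algorithm: the function name `count_reads`, the `(read_id, gene_id)` input format, and in particular the reading of "multi-mapped groups" as reads whose alignments span more than one gene (counted per set of genes) are assumptions made here for illustration only.

```python
from collections import Counter, defaultdict


def count_reads(alignments):
    """Split read counts into three categories (hypothetical sketch).

    alignments: iterable of (read_id, gene_id) pairs, one pair per
    reported alignment. This input format is an assumption, not the
    format used by pySeqRNA.
    """
    # Collect every gene hit reported for each read.
    hits = defaultdict(list)
    for read_id, gene_id in alignments:
        hits[read_id].append(gene_id)

    unique = Counter()      # (i) reads with exactly one alignment
    same_gene = Counter()   # (ii) several alignments, all in one gene
    groups = Counter()      # (iii) alignments spanning several genes (assumed meaning)

    for genes in hits.values():
        distinct = frozenset(genes)
        if len(genes) == 1:
            unique[genes[0]] += 1
        elif len(distinct) == 1:
            same_gene[genes[0]] += 1
        else:
            # Count the read once for the whole group of genes it hits.
            groups[distinct] += 1
    return unique, same_gene, groups


example = [
    ("r1", "geneA"),                    # single alignment
    ("r2", "geneA"), ("r2", "geneA"),   # two alignments, same gene
    ("r3", "geneA"), ("r3", "geneB"),   # alignments in two genes
    ("r4", "geneB"),                    # single alignment
]
unique, same_gene, groups = count_reads(example)
```

Keeping the three tallies separate, rather than discarding multi-mapped reads, lets a downstream step decide how much weight ambiguous reads should receive.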

