
Inclusion of pseudogenes in the Ensembl comparative genomics resources

Pseudogenes are segments of DNA that are related to functional genes but have lost functionality, often through the accumulation of multiple mutations. Pseudogenes can thus be found and annotated using sequence similarity, and linked to a parent or source gene, giving us insights into the history of that gene. However, very few resources consider pseudogenes when running comparative genomics analyses, and pseudogenes are largely missing from the major orthology databases. Ensembl is a platform that provides integrated genomics resources for more than 100 vertebrate species and a comprehensive comparative genomics database. Currently, this includes phylogenetic trees and orthology calls across all functional genes. In this work, we extend the Ensembl resources to include pseudogenes at multiple levels to help understand their evolution. First, we link the pseudogenes to their closest functional homologue with PseudoPipe (Zhang et al., Bioinformatics, 2006), an existing homology-based pseudogene identification pipeline. Then we update the multiple-sequence alignments and phylogenetic trees of their functional counterparts by constraining the original alignment and topology. We are thus able to supplement our orthology predictions with pseudogene-to-functional-gene orthologues (such as unitary pseudogenes) and pseudogene-to-pseudogene orthologues. Finally, we run our quality-assessment analyses based on conservation of local gene order and congruence with whole-genome alignments. We will present the results and some statistics of this approach on a test dataset comprising human and rodents, giving insights into the recent evolution of pseudogenes. We will also present prototype Ensembl comparative genomics displays (phylogenetic tree, orthologue and paralogue lists) that include pseudogenes, and we are seeking feedback before releasing the new data in a future version of Ensembl.
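As a rough illustration of the quality assessment based on conservation of local gene order, the sketch below scores a candidate orthologue pair by the fraction of flanking genes that also have a homologue near the partner gene. It is a simplified toy under stated assumptions, not the Ensembl implementation: the function names (`neighbours`, `gene_order_score`), the flank size of two genes per side, and the gene identifiers are all hypothetical.

```python
from typing import List, Set, Tuple


def neighbours(order: List[str], gene: str, flank: int = 2) -> List[str]:
    """Return up to `flank` genes on each side of `gene` in a chromosome order."""
    i = order.index(gene)
    left = order[max(0, i - flank):i]
    right = order[i + 1:i + 1 + flank]
    return left + right


def gene_order_score(
    order_a: List[str],
    order_b: List[str],
    gene_a: str,
    gene_b: str,
    homologues: Set[Tuple[str, str]],
    flank: int = 2,
) -> float:
    """Fraction of the 2 * flank neighbour slots around gene_a whose
    occupant has a homologue among the neighbours of gene_b.

    `homologues` holds (gene in genome A, gene in genome B) pairs.
    Each neighbour of gene_b can be matched at most once.
    """
    near_a = neighbours(order_a, gene_a, flank)
    near_b = neighbours(order_b, gene_b, flank)
    used = set()
    matched = 0
    for a in near_a:
        for b in near_b:
            if b not in used and (a, b) in homologues:
                used.add(b)
                matched += 1
                break
    return matched / (2 * flank)


# Toy data (hypothetical identifiers): a pseudogene and its candidate
# orthologue, each flanked by four genes.
order_a = ["A1", "A2", "PSEUDO_A", "A3", "A4"]
order_b = ["B1", "B2", "PSEUDO_B", "B3", "B9"]
homologues = {("A1", "B1"), ("A2", "B2"), ("A3", "B3")}

score = gene_order_score(order_a, order_b, "PSEUDO_A", "PSEUDO_B", homologues)
```

In this toy case three of the four flanking genes are matched, so a pair like this would be treated as well supported by local synteny. A real analysis would also have to handle strand, scaffold boundaries and neighbours that are themselves pseudogenes.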

F1000 Research Ltd

Guillaume Giroussens, Matthieu Muffato, Paul Flicek

2025


