Using set theory to reduce redundancy in pathway sets
1. Abstract

1.01 Background
The consolidation of pathway databases, such as KEGG [1], Reactome [2] and ConsensusPathDB [3], has generated widespread biological interest; however, pathway redundancy impedes the use of these consolidated datasets. Attempts to reduce this redundancy have focused on visualizing pathway overlap or merging pathways, but the resulting pathways may be of heterogeneous sizes and cover multiple biological functions. Efforts have also been made to deal with redundancy in pathway data by consolidating enriched pathways into a number of clusters or concepts. We present an alternative approach, which generates pathway subsets capable of covering all of the genes present within either pathway databases or enrichment results, generating substantial reductions in redundancy.

1.02 Results
We propose a method that uses set cover to reduce pathway redundancy without merging pathways. The proposed approach considers three objectives: removal of pathway redundancy, control of pathway size and coverage of the gene set. By applying set cover to the ConsensusPathDB dataset, we were able to produce a reduced set of pathways representing 100% of the genes in the original data set with 74% less redundancy, or 95% of the genes with 88% less redundancy. We also developed an algorithm to simplify enrichment data and applied it to a set of enriched osteoarthritis pathways, revealing that within the top ten pathways, five were redundant subsets of more enriched pathways. Applying set cover to the enrichment results removed these redundant pathways, allowing more informative pathways to take their place.

1.03 Conclusion
Our method provides an alternative approach for handling pathway redundancy, while ensuring that the pathways are of homogeneous size and that gene coverage is maximised. Pathways are not altered from their original form, allowing biological knowledge regarding the data set to be directly applicable. We demonstrate the ability of the algorithms to prioritise redundancy reduction, pathway size control or gene set coverage. The application of set cover to pathway enrichment results produces an optimised summary of the pathways that best represent the differentially regulated gene set.
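
The abstract does not spell out the algorithms, so the following is only a minimal sketch of the general idea, using the standard greedy approximation to set cover rather than the authors' actual implementation. The function names, the `coverage_target` and `max_size` parameters, the toy pathway data, and the redundancy measure (mean number of selected pathways each gene belongs to) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set


def greedy_set_cover(
    pathways: Dict[str, Set[str]],
    coverage_target: float = 1.0,
    max_size: Optional[int] = None,
) -> List[str]:
    """Pick whole, unmodified pathways until the target share of genes is covered.

    Standard greedy heuristic: at each step take the pathway that adds the
    most genes not yet covered. Ties are broken alphabetically by name.
    """
    universe: Set[str] = set().union(*pathways.values()) if pathways else set()
    # Optional size control: ignore pathways larger than max_size genes.
    candidates = {
        name: genes
        for name, genes in pathways.items()
        if max_size is None or len(genes) <= max_size
    }
    needed = coverage_target * len(universe)
    covered: Set[str] = set()
    chosen: List[str] = []
    while len(covered) < needed:
        best = max(
            sorted(candidates),
            key=lambda name: len(candidates[name] - covered),
            default=None,
        )
        if best is None or not (candidates[best] - covered):
            break  # no remaining pathway adds new genes
        chosen.append(best)
        covered |= candidates[best]
    return chosen


def mean_memberships(pathways: Dict[str, Set[str]], names: List[str]) -> float:
    """Assumed redundancy measure: average number of listed pathways per gene."""
    genes: Set[str] = set()
    for name in names:
        genes |= pathways[name]
    if not genes:
        return 0.0
    return sum(len(pathways[name]) for name in names) / len(genes)


# Toy data (hypothetical pathway and gene names).
pathways = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g1", "g2"},
    "C": {"g3", "g4"},
    "D": {"g5", "g6"},
    "E": {"g4", "g5"},
}

full_cover = greedy_set_cover(pathways)
print(full_cover)                                # ['A', 'D']
print(mean_memberships(pathways, list(pathways)))  # 2.0
print(mean_memberships(pathways, full_cover))      # 1.0
```

In this toy case, lowering `coverage_target` to 0.6 returns only `['A']`, which mirrors the trade-off described in the Results between covering every gene and removing more redundancy. Setting `max_size=2` excludes the large pathway and returns `['B', 'C', 'D']`, illustrating how size control can be prioritised instead.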

