Discovering potential key features of genome wide profiling data using Decision Variable Analysis

The identification of key features related to a phenotype of interest (POI) from high-dimensional data is one of the important problems in the study of omics data, such as transcriptome or DNA methylome data. However, these data are commonly contaminated by unwanted variation arising from platforms, batches, or other biological factors. The data can therefore be regarded as a combination of variation derived from the POI and variation derived from confounding factors. Failing to account for these factors can lead to spurious associations and missed signals. Based on this idea, we propose a novel feature selection method, Decision Variable Analysis (DVA), to extract the important POI-related features from data containing potential confounding factors. Applying the method to simulated and real data, we found that DVA performed better at identifying confounding factors than other methods, including linear regression and surrogate variable analysis. Our method is especially efficient for data in which the number of features greatly exceeds the sample size. We show that DVA yields improvements across high-dimensional datasets from different platforms in which the sample size is small relative to the number of features. The results indicate that DVA is an effective method for dissecting sources of variation in omics data with potential confounding factors. DVA is freely available at [https://github.com/xvon1/DVA](https://github.com/xvon1/DVA).
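The abstract's point that ignoring confounders can produce spurious associations can be illustrated with a small simulation. The sketch below is not DVA and does not reproduce the authors' experiments. It only contrasts a naive per-feature regression on the POI with a regression that also adjusts for a known batch variable. All names, effect sizes, and the 80% POI/batch overlap are assumptions chosen for illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """Ordinary least-squares slope of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def residualize(y, z):
    """Residuals of y after regressing it on z (with intercept)."""
    b = slope(z, y)
    mz, my = mean(z), mean(y)
    return [yi - (my + b * (zi - mz)) for yi, zi in zip(y, z)]


def adjusted_slope(poi, batch, y):
    """POI coefficient of y ~ poi + batch, via the Frisch-Waugh-Lovell theorem."""
    return slope(residualize(poi, batch), residualize(y, batch))


def simulate(n=2000, seed=0):
    """Hypothetical data: a binary POI and a batch label that agrees with it ~80% of the time."""
    rng = random.Random(seed)
    poi = [i % 2 for i in range(n)]
    batch = [p if rng.random() < 0.8 else 1 - p for p in poi]
    # Null feature: no POI effect, batch effect of 2.0 (assumed values).
    null_feature = [2.0 * b + rng.gauss(0, 1) for b in batch]
    # True feature: POI effect of 1.5 plus the same batch effect.
    true_feature = [1.5 * p + 2.0 * b + rng.gauss(0, 1) for p, b in zip(poi, batch)]
    return poi, batch, null_feature, true_feature


if __name__ == "__main__":
    poi, batch, null_feature, true_feature = simulate()
    print("null feature, naive POI slope:   ", slope(poi, null_feature))
    print("null feature, adjusted POI slope:", adjusted_slope(poi, batch, null_feature))
    print("true feature, adjusted POI slope:", adjusted_slope(poi, batch, true_feature))
```

In this toy setting the naive slope for the null feature is far from zero because the batch label is correlated with the POI, while the adjusted slope is close to zero. In real omics data the confounders are often unobserved, which is the situation methods such as surrogate variable analysis and DVA are designed for.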

Wiley

Jie Xie, Feng Xie, Cheng Li, Weike Lu, Zhen Yang

2024
