
Abstract B8: Why is there no consensus on GBM subgroups? Understanding the nature of biological and statistical variability in The Cancer Genome Atlas GBM data and the implications for molecular tumor classification

Abstract

Introduction: Clinical differences among patients with glioblastoma (GBM) suggest the existence of discrete subgroups of this disease. Such groups are not recognized histologically, engendering interest in molecular classification strategies for GBM. Numerous studies have described molecular fingerprints characteristic of GBM subclasses with unique genotypes and phenotypes, but these classifications have been difficult to reproduce. Accordingly, there remains little consensus regarding GBM subclasses, their characteristic molecular signatures, and the clinically relevant genotype-phenotype correlations of the putative classes. We hypothesize that a combination of biological and mathematical factors confounds interpretation of these data, and we have undertaken a comprehensive investigation into the nature and relative contributions of these factors to inconsistencies in molecular classification of GBMs.

Methods: We analyzed gene expression and clinical data for all 340 GBMs in The Cancer Genome Atlas (TCGA) profiled using the Affymetrix HT-HG-U133A platform. We created a logic model for systematically analyzing the sources of biological, technical, and mathematical variability inherent in this dataset and in the analytic strategies commonly employed in its interpretation. We then used standard linear classifiers and linear dimensionality reduction algorithms in conjunction with our logic model to investigate the nature and relative contributions of each factor.

Results: Gene expression data can be used in conjunction with unbiased linear classifiers to distinguish GBMs from other tumors, suggesting that the dataset contains valid biological information. However, the same classifiers fail to segregate GBMs into clinically relevant molecular subgroups. Further investigation reveals that commonly described sources of classification error, including individual sample characteristics, batch effects, and analytic and platform (technical) noise, make a measurable but proportionally minor contribution to inaccurate classification. Instead, our analysis suggests that three previously underappreciated classes of variability may account for a larger fraction of classification errors: biological variability (noise) among tumors, of which we describe three types; skewed data distributions incorrectly assumed to be normal; and inherent nonlinear/nonorthogonal relationships among the variables (genes) used in conjunction with classification algorithms that assume linearity.

Conclusions: Technical sources of variability are often assumed to be the primary source of inaccurate molecular classification of GBMs. Our analysis of the TCGA data suggests a contributory role for these factors, and we believe that additional research in modeling this error is critical to improving classification accuracy. Nonetheless, our analysis also suggests that three rarely discussed factors (biological variability, non-normal data distributions, and nonlinear relationships among genes) may together be responsible for a proportionally larger component of classification error. Additional research is necessary to better characterize the nature and relative magnitude of each of these effects. Subsequent efforts can then be made to develop strategies capable of more appropriately identifying and addressing these factors, thereby improving the accuracy and precision of future molecular classifiers for GBM.

Citation Information: Clin Cancer Res 2010;16(7 Suppl):B8
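Two of the factors named in the abstract, skewed distributions assumed to be normal and nonlinear relationships among genes fed to linear classifiers, can be illustrated with a small synthetic sketch. Nothing below comes from the TCGA data or from the authors' analysis: the log-normal "expression" values, the two interacting "genes", and the helper names sample_skewness and linear_accuracy are all assumptions made for illustration, and the least-squares classifier is only a stand-in for the unspecified "standard linear classifiers".

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_skewness(x):
    """Moment-based sample skewness (close to 0 for symmetric data)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

# Assumption: synthetic log-normal values stand in for raw expression
# intensities, which makes them right-skewed on the raw scale.
raw = rng.lognormal(mean=0.0, sigma=1.0, size=5000)
skew_raw = sample_skewness(raw)
skew_log = sample_skewness(np.log(raw))  # log scale is symmetric by construction

def linear_accuracy(features, labels):
    """Fit a least-squares linear classifier; return its training accuracy."""
    design = np.column_stack([features, np.ones(len(features))])
    weights, *_ = np.linalg.lstsq(design, labels, rcond=None)
    predictions = np.where(design @ weights >= 0.0, 1.0, -1.0)
    return (predictions == labels).mean()

# Assumption: two hypothetical genes whose class label depends only on
# their interaction (an XOR-like, nonlinear relationship).
genes = rng.normal(size=(2000, 2))
labels = np.where(genes[:, 0] * genes[:, 1] > 0, 1.0, -1.0)

# A purely linear model on the two genes versus one given the product term.
acc_linear = linear_accuracy(genes, labels)
with_interaction = np.column_stack([genes, genes[:, 0] * genes[:, 1]])
acc_interaction = linear_accuracy(with_interaction, labels)

print(f"skewness: raw={skew_raw:.2f}, log-transformed={skew_log:.2f}")
print(f"accuracy: linear={acc_linear:.2f}, with interaction={acc_interaction:.2f}")
```

In this toy setting the linear model cannot separate classes defined by an interaction until the interaction is supplied as a feature, and a normality assumption is reasonable only after the log transform. Whether either effect operates at a comparable scale in real GBM expression data is the question the abstract raises, not something this sketch can answer.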

