
Applying gradient tree boosting to QTL mapping with Shapley additive explanations

Abstract Mapping quantitative trait loci (QTLs) is one of the major goals of quantitative genetics; however, identifying the interactions between QTLs (i.e., epistasis) remains challenging. Recently developed machine learning methods, such as deep learning and gradient boosting, are being applied widely in practice. These methods could advance QTL mapping methodologies because of their high capability for capturing complex relationships among features. One problem with applying such complex models to QTL mapping is the evaluation of feature importance. In this study, XGBoost, a popular gradient tree boosting algorithm, was applied to QTL mapping in biparental populations with Shapley additive explanations (SHAP). SHAP is a local (i.e., instance-wise) importance index with properties that are desirable in a feature importance index. The SHAP-assisted XGBoost (SHAP-XGB) was compared with conventional methods, including composite interval mapping (CIM), multiple interval mapping (MIM), inclusive CIM (ICIM), and BayesC, using simulations and rice heading date data. SHAP-XGB performed comparably to CIM, MIM, ICIM, and BayesC in mapping main QTL effects and was superior to MIM, ICIM, and BayesC in mapping QTL interaction effects. Because SHAP evaluates local importance, interactions between markers can be visualized by plotting SHAP interaction values for each instance (plant/line). These results illustrate the strength of SHAP-XGB in detecting and interpreting epistatic QTLs and suggest that SHAP-XGB can complement conventional methods.
Cold Spring Harbor Laboratory
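
The abstract describes SHAP as a local (instance-wise) importance index and notes that SHAP interaction values can be inspected per plant or line. The sketch below illustrates that idea only; it is not the authors' SHAP-XGB pipeline. It uses no XGBoost model, and it computes exact Shapley values by brute-force enumeration of coalitions rather than by a tree-specific algorithm. The trait model `phenotype`, the -1/+1 genotype coding, the balanced background set, and all function names are hypothetical choices made for this example.

```python
from itertools import combinations, product
from math import factorial


def phenotype(g):
    """Hypothetical trait model (an assumption, not from the paper):
    additive effect at marker 0, epistatic effect between markers 1 and 2."""
    return 2.0 * g[0] + 1.5 * g[1] * g[2]


# Hypothetical background set: every genotype combination of three
# markers coded -1/+1, so each marker is balanced and independent.
BACKGROUND = list(product((-1, 1), repeat=3))


def coalition_value(model, x, subset, background):
    """Mean prediction when the markers in `subset` are fixed to the
    values in x and the remaining markers are drawn from the background."""
    total = 0.0
    for b in background:
        mixed = [x[i] if i in subset else b[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley value of each marker for one individual x."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [k for k in range(n) if k != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                with_i = coalition_value(model, x, s | {i}, background)
                without_i = coalition_value(model, x, s, background)
                phi[i] += weight * (with_i - without_i)
    return phi


def shapley_interaction(model, x, i, j, background):
    """Shapley interaction index for markers i != j, using the convention
    in which the pairwise effect is split equally between (i, j) and (j, i)."""
    n = len(x)
    others = [k for k in range(n) if k not in (i, j)]
    value = 0.0
    for size in range(n - 1):
        weight = factorial(size) * factorial(n - size - 2) / (2 * factorial(n - 1))
        for combo in combinations(others, size):
            s = set(combo)
            delta = (
                coalition_value(model, x, s | {i, j}, background)
                - coalition_value(model, x, s | {i}, background)
                - coalition_value(model, x, s | {j}, background)
                + coalition_value(model, x, s, background)
            )
            value += weight * delta
    return value


# Two hypothetical lines that differ only at marker 2.
for line in [(1, 1, 1), (1, 1, -1)]:
    print(line, shapley_values(phenotype, line, BACKGROUND),
          shapley_interaction(phenotype, line, 1, 2, BACKGROUND))
```

In this toy setting the interaction value for markers 1 and 2 changes sign between the two lines, which is the kind of instance-level pattern the abstract says can be visualized. Enumeration scales exponentially with the number of markers, so it is practical only for a handful of features.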

Related Results

Development of doubled haploid population and QTL mapping for Fusarium stalk rot (FSR) resistance in tropical maize
Abstract Fusarium stalk rot disease (FSR) caused by Fusarium verticillioides is emerging as the major production constraint in maize across the world. As a prelude to develo...
Mapping of QTL for resistance to fusarium stalk rot (FSR) in tropical maize (Zea mays L.)
Fusarium stalk rot disease (FSR) caused by Fusarium verticillioides is emerging as the major production constraint in maize across the world. As a prelude to developing maize hybrids ...
QTL and Candidate Genes: Techniques and Advancement in Abiotic Stress Resistance Breeding of Major Cereals
At least 75% of the world’s grain production comes from the three most important cereal crops: rice (Oryza sativa), wheat (Triticum aestivum), and maize (Zea mays). However, abioti...
Costly Resistance to Parasitism
Abstract Information on the molecular basis of resistance and the evolution of resistance is crucial to an understanding of the appearance, spread, and distribution ...
A Theory of Heterosis
Abstract Heterosis refers to the superior performance of a hybrid over its parents. It is the basis for hybrid breeding, particularly for maize and rice. Genetically it is due to int...
Detection of Quantitative Trait Loci (QTL) associated with the spring regrowth vigor trait in alfalfa (Medicago sativa L.)
Abstract Background: Alfalfa (Medicago sativa L.) is a perennial forage legume with a reputation as being the “queen of forage”. Spring regrowth vigor refers to the proces...
Supplementary Python Jupyter and R/qtl Notebooks
Part 2: Supplementary Python Jupyter and R/qtl Notebooks is the essential companion to the main volume QTL Mapping with Python and R/qtl: A Reproducible Pipeline for Crop Genetics....
Genetics
We present a new flexible, simple, and powerful genome-scan method (flexible intercross analysis, FIA) for detecting quantitative trait loci (QTL) in experimental line crosses. The...