MoTSE: an interpretable task similarity estimator for small molecular property prediction tasks

Abstract
Understanding the molecular properties of small molecules (e.g., physical, chemical or physiological characteristics and biological activities) plays an essential role in biomedical research. The growing number of available datasets has enabled the development of data-driven computational methods, especially machine learning based methods, to address molecular property prediction tasks. Because experimental labels are costly to obtain, the dataset for an individual task generally contains a limited amount of data, which has inspired the application of transfer learning to boost performance on molecular property prediction tasks. Our analyses revealed that transferring from similar tasks, rather than from randomly chosen ones, can significantly improve the performance of transfer learning in this field. To provide an accurate estimate of task similarity, we propose an effective and interpretable computational tool named Molecular Tasks Similarity Estimator (MoTSE). By extracting task-related local and global knowledge from pretrained graph neural networks (GNNs), MoTSE projects individual tasks into a latent space and measures the distance between the embedded vectors to estimate task similarity, which in turn is used to improve molecular property prediction. We validated that the task similarity estimated by MoTSE can serve as useful guidance for designing a more accurate transfer learning strategy for molecular property prediction. Experimental results showed that such a strategy greatly outperformed baseline methods, including training from scratch and multitask learning. Moreover, MoTSE makes the estimated task similarity interpretable by visualizing the important loci in the molecules identified by the attribution method employed in MoTSE.
In summary, MoTSE provides an accurate method for estimating the similarity of molecular property prediction tasks for effective transfer learning, with good interpretability for the chemical or biological insights underlying the task similarity.
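The step of comparing embedded task vectors can be illustrated with a minimal sketch. It is not the MoTSE implementation: the abstract does not say which distance is used, so cosine similarity is an assumption here, and the task names, embedding values, and the helper `rank_source_tasks` are hypothetical placeholders for whatever the pretrained GNNs would actually produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_source_tasks(target, embeddings):
    """Rank every task except `target` by similarity to it, most similar first.

    `embeddings` maps a task name to its vector in the latent space.
    """
    target_vec = embeddings[target]
    scores = [
        (name, cosine_similarity(target_vec, vec))
        for name, vec in embeddings.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical task embeddings (illustrative values, not taken from the paper).
task_embeddings = {
    "solubility": [0.9, 0.1, 0.3],
    "lipophilicity": [0.8, 0.2, 0.4],
    "toxicity": [0.1, 0.9, 0.2],
}

ranking = rank_source_tasks("solubility", task_embeddings)
best_source = ranking[0][0]  # candidate source task for transfer learning
print(ranking)
```

In a transfer learning setup along the lines described above, the top-ranked task would be the one whose pretrained model is fine-tuned on the target task, instead of a randomly chosen source.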
