The AABBA Graph Kernel: Atom–Atom, Bond–Bond, and Bond–Atom Autocorrelations for Machine Learning
Graphs are one of the most natural and powerful representations available for molecules; natural because they have an intuitive correspondence to skeletal formulas, the language used by chemists worldwide, and powerful, because they are highly expressive both globally (molecular topology) and locally (atom and bond properties). Graph kernels are used to transform molecular graphs into fixed-length vectors, which, based on their capacity of measuring similarity, can be used as fingerprints for machine learning (ML). To date, graph kernels have mostly focused on the atomic nodes of the graph. In this work, we developed a graph kernel based on atom–atom, bond–bond, and bond–atom (AABBA) autocorrelations. The resulting vector representations were tested on regression ML tasks on a dataset of transition metal complexes; a benchmark motivated by the higher complexity of these compounds relative to organic molecules. In particular, we tested different flavors of the AABBA kernel in the prediction of the energy barriers and bond distances of the Vaska’s complex dataset (Friederich et al., Chem. Sci., 2020, 11, 4584). For a variety of ML models, including neural networks, gradient boosting machines, and Gaussian processes, we showed that AABBA outperforms the baseline including only atom–atom autocorrelations. Dimensionality reduction studies also showed that the bond–bond and bond–atom autocorrelations yield many of the most relevant features. We believe that the AABBA graph kernel can accelerate the exploration of large chemical spaces and inspire novel molecular representations in which both atomic and bond properties play an important role.
American Chemical Society (ACS)
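The abstract does not give the autocorrelation formulas, so the following is only an illustrative sketch of how atom–atom, bond–bond, and bond–atom terms could be turned into a fixed-length vector. It assumes the common graph-autocorrelation form, in which products of two properties are summed over all pairs separated by a topological distance d. The function names, the use of a line graph for bond–bond distances, the definition of the bond–atom distance (shortest distance from the atom to either end of the bond), and the toy property values are all assumptions made for this example, not details taken from the paper.

```python
from collections import deque
from itertools import combinations


def shortest_path_lengths(adjacency):
    """All-pairs topological distances via breadth-first search."""
    dist = {}
    for source in adjacency:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen[neighbor] = seen[node] + 1
                    queue.append(neighbor)
        dist[source] = seen
    return dist


def autocorrelation(adjacency, prop, max_depth):
    """AC_d = sum of prop[i] * prop[j] over unordered node pairs at distance d.

    Depth 0 is the sum of squared properties (each node paired with itself).
    """
    dist = shortest_path_lengths(adjacency)
    vector = [0.0] * (max_depth + 1)
    for i in adjacency:
        vector[0] += prop[i] * prop[i]
    for i, j in combinations(adjacency, 2):
        d = dist[i].get(j)  # None if i and j are disconnected
        if d is not None and d <= max_depth:
            vector[d] += prop[i] * prop[j]
    return vector


def line_graph(bonds):
    """Bonds become nodes; two bonds are adjacent if they share an atom."""
    adjacency = {bond: set() for bond in bonds}
    for a, b in combinations(bonds, 2):
        if set(a) & set(b):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def bond_atom_correlation(adjacency, bonds, atom_prop, bond_prop, max_depth):
    """Assumed bond-atom term: distance = min distance from atom to a bond end."""
    dist = shortest_path_lengths(adjacency)
    vector = [0.0] * (max_depth + 1)
    for bond in bonds:
        for atom in adjacency:
            ends = [dist[end][atom] for end in bond if atom in dist[end]]
            if ends and min(ends) <= max_depth:
                vector[min(ends)] += bond_prop[bond] * atom_prop[atom]
    return vector


# Toy three-atom chain 0-1-2 with made-up atom and bond properties.
atoms = {0: {1}, 1: {0, 2}, 2: {1}}
bonds = [(0, 1), (1, 2)]
atom_prop = {0: 1.0, 1: 2.0, 2: 3.0}
bond_prop = {(0, 1): 1.0, (1, 2): 2.0}

aa = autocorrelation(atoms, atom_prop, max_depth=2)
bb = autocorrelation(line_graph(bonds), bond_prop, max_depth=1)
ba = bond_atom_correlation(atoms, bonds, atom_prop, bond_prop, max_depth=1)
fingerprint = aa + bb + ba  # fixed-length vector for a given set of depths
print(fingerprint)
```

Concatenating the three blocks gives a vector whose length depends only on the chosen depths, not on the size of the molecule, which is what allows it to serve as an ML fingerprint. Dropping the `bb` and `ba` blocks gives the atom-only kind of baseline the abstract compares against.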
Related Results
AABBA: Atom–Atom Bond–Bond Bond–Atom Graph Kernel for Machine Learning on Molecules and Materials
Graphs are one of the most natural and powerful representations available for molecules; natural because they have an intuitive correspondence to skeletal formulas, the language us...
Machine Learning the Hydrogen Adsorption Capacity of Metal Organic Frameworks
High-throughput virtual screening and machine learning (ML) are powerful tools for accelerating the discovery of nanoporous adsorbents for gas storage applications, including metal...
Physicochemical Properties of Wheat Fractionated by Wheat Kernel Thickness and Separated by Kernel Specific Density
Two wheat cultivars, soft white winter wheat Yang‐mai 11 and hard white winter wheat Zheng‐mai 9023, were fractionated by kernel thickness into five sections; the fractiona...
Genetic Variation in Potential Kernel Size Affects Kernel Growth and Yield of Sorghum
Large‐seededness can increase grain yield in sorghum [Sorghum bicolor (L.) Moench] if larger kernel size more than compensates for the associated reduction in kernel number. The ai...
Sorghum Kernel Weight
The influence of genotype and panicle position on sorghum [Sorghum bicolor (L.) Moench] kernel growth is poorly understood. In the present study, sorghum kernel weight (KW) differe...
Isolation Graph Kernel
A recent Wasserstein Weisfeiler-Lehman (WWL) Graph Kernel has a distinctive feature: Representing the distribution of Weisfeiler-Lehman (WL)-embedded node vectors of a graph in a h...
Abstract 902: Explainable AI: Graph machine learning for response prediction and biomarker discovery
Accurately predicting drug sensitivity and understanding what is driving it are major challenges in drug discovery. Graphs are a natural framework for captu...
Domination of Polynomial with Application
In this paper, we initiate the study of the domination polynomial. Consider G=(V,E) to be a simple, finite, and directed graph without isolated vertices. We present a study of the Ira...


