
LifeLong Learning for Large Language Models in Predicting Chemical Reaction Yields

Large Language Models (LLMs) based on transformer architectures excel at internet-scale tasks. However, real-world scientific settings, such as synthetic chemistry laboratories and autonomous experimental setups, typically generate data incrementally in batches as new chemical reactions are conducted, rather than as static, large-scale datasets. Motivated by scaling laws, which suggest that larger models are increasingly prone to catastrophic forgetting when learning from sequentially arriving data, we present a case study investigating continual learning for chemical reaction yield prediction using Mistral-7B, a 7.3-billion-parameter open-weight LLM. We first establish a baseline by evaluating model performance on the Suzuki Coupling Reactions dataset using both supervised full fine-tuning and Low-Rank Adaptation (LoRA), showing competitive yield prediction accuracy. To mimic real-world conditions, we adopt a task-incremental learning framework in which the model learns new task groups one at a time, each defined by a unique pair of reactants. This sequential learning setup enables us to directly assess the model's ability to retain prior knowledge. We demonstrate that when the model is trained within this sequential learning paradigm with traditional procedures, it exhibits significant loss of prior knowledge, a phenomenon known in the continual learning community as catastrophic forgetting and an under-recognized challenge in the chemistry community. To understand this challenge, we first model the continual learning problem and investigate the source of forgetting. We then incorporate experience replay to maintain near-baseline performance across tasks without forgetting. These results highlight the importance of integrating continual learning strategies into LLM-based chemical modeling pipelines, particularly as future experimental platforms increasingly generate non-stationary reaction data.
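The task-incremental setup described in the abstract can be illustrated with a small sketch. It is not taken from the paper. The record fields `reactant_a`, `reactant_b`, and `yield`, the function names, and the error values in the usage note are all hypothetical, and the forgetting measure shown (final error on a task minus the best error seen earlier on that task) is one common convention from the continual learning literature, assumed here rather than confirmed by the source.

```python
from collections import defaultdict


def group_by_reactant_pair(records):
    """Group reaction records into tasks keyed by their reactant pair.

    Field names are hypothetical; a real dataset may use other columns.
    """
    tasks = defaultdict(list)
    for rec in records:
        tasks[(rec["reactant_a"], rec["reactant_b"])].append(rec)
    return dict(tasks)


def forgetting_per_task(errors):
    """Compute how much the error on each earlier task grew by the end.

    errors[k][j] is the error on task j measured after training on task k,
    for j <= k. For every task except the last one, return the final error
    minus the lowest error observed before the final training stage.
    Positive values mean the model got worse on that task (forgetting).
    """
    last = len(errors) - 1
    result = {}
    for j in range(last):
        best_earlier = min(errors[k][j] for k in range(j, last))
        result[j] = errors[last][j] - best_earlier
    return result
```

As a usage illustration with made-up numbers (not results from the paper), `forgetting_per_task([[0.10], [0.25, 0.12], [0.40, 0.20, 0.11]])` reports an increase of about 0.30 for the first task and about 0.08 for the second.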
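Experience replay, as named in the abstract, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the `ReplayBuffer` class, the reservoir-sampling policy, the 50% replay fraction, and the `train_step` callback are all hypothetical choices, and the source does not say how its buffer is filled or sampled.

```python
import random


class ReplayBuffer:
    """Fixed-size memory of past examples (hypothetical helper).

    Uses reservoir sampling, so every example seen so far has the same
    chance of being kept. This policy is an assumption, not the paper's.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = example

    def sample(self, k):
        return self.rng.sample(self.items, min(k, len(self.items)))


def mixed_batches(task_examples, buffer, batch_size, replay_fraction=0.5):
    """Yield batches that combine new-task examples with replayed ones."""
    n_replay = int(batch_size * replay_fraction)
    n_new = max(1, batch_size - n_replay)
    for start in range(0, len(task_examples), n_new):
        new = task_examples[start:start + n_new]
        yield new + buffer.sample(n_replay)


def train_sequentially(tasks, train_step, buffer, batch_size=4):
    """Visit tasks in order; store each task in the buffer once it is done."""
    for task_examples in tasks:
        for batch in mixed_batches(task_examples, buffer, batch_size):
            train_step(batch)  # e.g. one optimizer step on a LoRA-adapted model
        for example in task_examples:
            buffer.add(example)
```

The design idea is that each update on a new reactant-pair task also sees a few stored examples from earlier tasks, so the gradient keeps pulling the model toward solutions that still fit old data. In practice `train_step` would wrap a fine-tuning step of the LLM; here it is left as a plain callback so the sketch stays self-contained.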

Related Results

Hubungan Perilaku Pola Makan dengan Kejadian Anak Obesitas
Isolation, characterization and semi-synthesis of natural products dimeric amide alkaloids
 Isolation, characterization of natural products dimeric amide alkaloids from roots of the Piper chaba Hunter. The synthesis of these products using intermolecular [4+2] cycloaddit...
Učinak poučavanja razrednomu jeziku u izobrazbi nastavnika njemačkoga
The actual use of classroom language is principally limited to the classroom environment. As far as foreign language learning is concerned, the classroom often turns out to be the ...
Lifelong Learning in Confucius Philosophical Perspective
The research article “Lifelong Learning in Confucius Philosophical Perspective” has three objectives: 1. To study UNESCO's lifelong learning concept. 2. To explore the concept of l...
Selection of Injectable Drug Product Composition using Machine Learning Models (Preprint)
BACKGROUND As of July 2020, a Web of Science search of “machine learning (ML)” nested within the search of “pharmacokinetics or pharmacodynamics” yielded over 100...
Frequency of Common Chromosomal Abnormalities in Patients with Idiopathic Acquired Aplastic Anemia
Objective: To determine the frequency of common chromosomal aberrations in local population...
CREATING LEARNING MEDIA IN TEACHING ENGLISH AT SMP MUHAMMADIYAH 2 PAGELARAN ACADEMIC YEAR 2020/2021
The pandemic Covid-19 currently demands teachers to be able to use technology in teaching and learning process. But in reality there are still many teachers who have not been able ...
