
Continual Learning of Large Language Models: A Comprehensive Survey

The challenge of effectively and efficiently adapting statically pre-trained Large Language Models (LLMs) to ever-evolving data distributions remains predominant. When tailored for specific needs, pre-trained LLMs often suffer significant performance degradation in previous knowledge domains, a phenomenon known as “catastrophic forgetting”. While extensively studied in the Continual Learning (CL) community, this problem presents new challenges in the context of LLMs. In this survey, we provide a comprehensive overview and detailed discussion of current research progress on LLMs within the context of CL. Besides introducing preliminary knowledge, this survey is structured into four main sections. We first give an overview of continually learning LLMs, which consists of two directions of continuity: vertical continuity (or vertical continual learning), i.e., continual adaptation from general to specific capabilities, and horizontal continuity (or horizontal continual learning), i.e., continual adaptation across time and domains (Section 3). Following vertical continuity, we summarize three stages of learning LLMs in the context of modern CL: Continual Pre-Training (CPT), Domain-Adaptive Pre-training (DAP), and Continual Fine-Tuning (CFT) (Section 4). We then provide an overview of evaluation protocols for continual learning with LLMs, along with currently available data sources (Section 5). Finally, we discuss intriguing questions related to continual learning for LLMs (Section 6). This survey sheds light on the relatively understudied domain of continually pre-training, adapting, and fine-tuning large language models, suggesting the need for greater attention from the community. Key areas requiring immediate focus include the development of practical and accessible evaluation benchmarks, along with methodologies specifically designed to counter forgetting and enable knowledge transfer within the evolving landscape of LLM learning paradigms. The full list of papers examined in this survey is available at https://github.com/Wang-ML-Lab/llm-continual-learning-survey.

