Efficient Adaptation of Pre-trained Models: A Survey of PEFT for Language, Vision, and Multimodal Learning
The rapid scaling of pre-trained foundation models in natural language processing (NLP), computer vision (CV), and multimodal learning has led to growing interest in methods that can adapt these large models efficiently without incurring the full computational or storage costs of traditional fine-tuning. Parameter-Efficient Fine-Tuning (PEFT) methods address this challenge by modifying or introducing a small subset of learnable parameters while keeping the majority of the model frozen. In this survey, we present a comprehensive and systematic overview of the landscape of PEFT approaches. We categorize the main families of PEFT methods—including prompt tuning, adapter tuning, low-rank adaptation (e.g., LoRA), BitFit, and sparse updating—providing unified mathematical formulations, detailed comparative analyses, and extensive discussion of their theoretical underpinnings and empirical properties. We also explore implementation considerations, evaluation benchmarks, and real-world applications across language, vision, and multimodal domains. Finally, we highlight open challenges, interpretability gaps, and future research directions in this rapidly evolving field. Our goal is to serve as a foundation for researchers and practitioners seeking to understand, apply, or advance the state of the art in parameter-efficient adaptation of large-scale models.
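To make two of the method families named above concrete, here is a minimal, hypothetical PyTorch sketch (not code from the survey itself): a wrapper implementing low-rank adaptation, which learns h = Wx + (alpha/r) * BAx while the pre-trained weight W stays frozen, and a helper applying BitFit, which unfreezes only bias vectors. The names LoRALinear and apply_bitfit are illustrative assumptions, not identifiers from the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen nn.Linear plus a trainable low-rank residual: h = W x + (alpha / r) * B A x."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pre-trained weight and bias stay frozen
        # A gets a small random init, B starts at zero, so training begins at the base model
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (..., d_in) -> (..., r) -> (..., d_out), added to the frozen projection
        return self.base(x) + self.scale * (x @ self.lora_A.T) @ self.lora_B.T

def apply_bitfit(model: nn.Module) -> None:
    """BitFit: make only bias vectors trainable, freeze every other parameter."""
    for name, p in model.named_parameters():
        p.requires_grad = name.endswith("bias")

# Quick check: only the two low-rank factors are trainable in the wrapped layer.
layer = LoRALinear(nn.Linear(768, 768), r=4)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable parameters: {trainable} of {total}")  # 6144 of 596736, roughly 1%
```

Wrapping only selected layers in this way (for example, the attention projections of a transformer) typically leaves a small fraction of the model's parameters trainable, which is the storage and compute saving the abstract refers to.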
Related Results
Hubungan Perilaku Pola Makan dengan Kejadian Anak Obesitas (The Relationship between Dietary Behavior and the Incidence of Childhood Obesity)
Revisiting Fine-Tuning: A Survey of Parameter-Efficient Techniques for Large AI Models
Foundation models have revolutionized artificial intelligence by achieving state-of-the-art performance across a wide range of tasks. However, fine-tuning these massive models for ...
Advances in Parameter-Efficient Fine-Tuning: Optimizing Foundation Models for Scalable AI
The unprecedented scale and capabilities of foundation models, such as large language models and vision transformers, have transformed artificial intelligence (AI) across diverse d...
Imagined worldviews in John Lennon’s “Imagine”: a multimodal re-performance / Visões de mundo imaginadas no “Imagine” de John Lennon: uma re-performance multimodal
Abstract: This paper addresses the issue of multimodal re-performance, a concept developed by us, in view of the fact that the famous song “Imagine”, by John Lennon, was published ...
AFR-BERT: Attention-based mechanism feature relevance fusion multimodal sentiment analysis model
Multimodal sentiment analysis is an essential task in natural language processing which refers to the fact that machines can analyze and recognize emotions through logical reasonin...
Depth-aware salient object segmentation
Object segmentation is an important task which is widely employed in many computer vision applications such as object detection, tracking, recognition, and ret...
Models and Algorithms for Multimodal Data Processing
Information technologies and computer equipment are used in almost all areas of activity, which is why new areas of their use are emerging, and the level of ICT implementation is d...
Aviation English - A global perspective: analysis, teaching, assessment
This e-book brings together 13 chapters written by aviation English researchers and practitioners settled in six different countries, representing institutions and universities fro...