Efficient Fine-Tuning of Large Language Models via a Low-Rank Gradient Estimator
In this paper, we present a Low-Rank Gradient Estimator (LoGE) to accelerate the fine-tuning computation of transformers, especially large language models (LLMs). Unlike Parameter-Efficient Fine-Tuning (PEFT) methods, which primarily aim to minimize the number of fine-tuned parameters, LoGE also significantly reduces the computational cost of activation gradient calculations by decomposing the pre-trained weights and using the resulting low-rank matrices during the backward pass. Our approach includes an effective method for identifying sensitive and important latent subspaces in large models before training on downstream datasets. Because LoGE does not alter the network structure, it can be conveniently integrated into existing models. We validated LoGE’s efficacy through comprehensive experiments across a range of models and tasks. For the widely used LLaMA model equipped with LoRA, LoGE achieves up to a 1.3× speedup with only graceful degradation in accuracy.
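The idea of using low-rank matrices in the backward pass can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single linear layer y = x Wᵀ, and it swaps in a plain truncated SVD for the paper's subspace-selection procedure, which the abstract does not detail. The function names (`low_rank_factors`, `exact_input_grad`, `low_rank_input_grad`) and the synthetic weight matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def low_rank_factors(W, rank):
    """Truncated SVD (an assumption, not the paper's selection method).

    Returns A (d_out x rank) and B (rank x d_in) such that W ~= A @ B.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale each kept column by its singular value
    B = Vt[:rank, :]
    return A, B


def exact_input_grad(grad_y, W):
    # For y = x @ W.T, the activation gradient is dL/dx = dL/dy @ W.
    # Cost: about n * d_out * d_in multiply-adds.
    return grad_y @ W


def low_rank_input_grad(grad_y, A, B):
    # Estimate dL/dx with the factors instead of the full weight.
    # Cost: about n * rank * (d_out + d_in) multiply-adds.
    return (grad_y @ A) @ B


rng = np.random.default_rng(0)
n, d_in, d_out, rank = 8, 64, 48, 4

# Synthetic weight that is exactly rank-4, so the estimate should be near exact.
W = rng.standard_normal((d_out, rank)) @ rng.standard_normal((rank, d_in))
grad_y = rng.standard_normal((n, d_out))

A, B = low_rank_factors(W, rank)
exact = exact_input_grad(grad_y, W)
approx = low_rank_input_grad(grad_y, A, B)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)

# Multiply-add counts for the two backward products.
exact_cost = n * d_out * d_in
low_rank_cost = n * rank * (d_out + d_in)
```

With real pre-trained weights the matrix is not exactly low-rank, so the estimate carries an error. For a truncated SVD, the Frobenius norm of that error is at most the norm of `grad_y` times the first discarded singular value. How LoGE picks its subspaces and controls this error is specific to the paper and is not reproduced here.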

