Activation-Guided Layer Selection for LoRA

Low-Rank Adaptation (LoRA) has become a widely adopted parameter-efficient fine-tuning (PEFT) technique for large language models (LLMs). LoRA’s benefits stem from its lightweight, modular adapters. Standard LoRA applies adapters uniformly across all Transformer layers, implicitly assuming that each layer contributes equally to task adaptation. However, LLMs have been found to contain internal substructures that contribute disproportionately. In this work, we provide a theoretical analysis of how LoRA weight updates are influenced by a layer’s activation magnitude. We propose Act-LoRA, a simple activation-guided layer selection strategy for selective Low-Rank Adaptation. We evaluate this strategy on both encoder-only and decoder-only architectures using the GLUE benchmark. With DeBERTaV3-Base on a single-instance GPU, our method achieved a 20% saving in GPU hours (GPUh) with a 1% drop in GLUE score while using 50% fewer LoRA parameters. With the Llama-3.1-8B model in Distributed Data Parallel mode, it achieved a 2% GPUh saving with a drop of less than 0.15% in GLUE score while using 25% fewer LoRA parameters. Our experiments and analysis show that the compute and memory requirements of LoRA adapters increase linearly with the number of selected layers. We further compare activation-guided selection against gradient-guided importance metrics and show that activation norms yield more stable and reproducible layer rankings across seeds and datasets. Overall, our results demonstrate that activation-guided layer selection is a practical and effective way to improve the efficiency of LoRA fine-tuning, and that it is immediately compatible with some existing PEFT techniques and distributed training frameworks.
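
The abstract does not give the exact scoring rule, so the following is only a minimal sketch of what activation-guided layer selection could look like, not the authors' implementation. It assumes that each layer is scored by the mean L2 norm of its output activations on a small calibration batch and that the highest-scoring fraction of layers receives adapters. The function names (`activation_norm`, `select_layers`, `lora_param_count`), the toy data, and the choice of one adapted square weight matrix per layer are all hypothetical. The parameter count uses the standard LoRA fact that a rank-r adapter on a d_out × d_in matrix adds r · (d_in + d_out) parameters, which is why adapter size grows linearly with the number of selected layers.

```python
import math
import random


def activation_norm(activations):
    """Mean L2 norm over a batch of activation vectors (assumed scoring rule)."""
    norms = [math.sqrt(sum(x * x for x in vec)) for vec in activations]
    return sum(norms) / len(norms)


def select_layers(layer_activations, fraction):
    """Return the sorted indices of the top `fraction` of layers by score.

    Assumption: layers with larger activation norms are the ones selected.
    """
    scores = [activation_norm(acts) for acts in layer_activations]
    k = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def lora_param_count(num_layers, d_in, d_out, rank):
    """LoRA parameters when one d_out x d_in matrix per layer is adapted."""
    return num_layers * rank * (d_in + d_out)


# Toy calibration data (hypothetical): 12 layers, 4 vectors of width 8 each.
rng = random.Random(0)
toy_activations = [
    [[rng.gauss(0.0, 1.0 + 0.1 * layer) for _ in range(8)] for _ in range(4)]
    for layer in range(12)
]

selected = select_layers(toy_activations, fraction=0.5)
full = lora_param_count(12, d_in=768, d_out=768, rank=8)
reduced = lora_param_count(len(selected), d_in=768, d_out=768, rank=8)
print("selected layers:", selected)
print("LoRA parameters, all layers vs. selected:", full, reduced)
```

In a real training setup, the per-layer activations would typically be collected with forward hooks on the pretrained model, and the selected indices would then be passed to whatever mechanism the PEFT framework offers for restricting adapters to specific layers.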

MDPI AG

Aditya Dawadikar Pooja Shyamsundar Rashmi Vishwanath Bhat Navrati Saxena

Information

2026



