Joint Localization and Activation Editing for Low-Resource Fine-Tuning

Wen Lai; Alexander Fraser; Ivan Titov

arXiv:2502.01179·cs.CL·May 30, 2025

TL;DR

JoLA jointly learns which Transformer heads to edit and how to edit them, improving fine-tuning of large language models when only a few hundred training examples are available.

Contribution

The paper introduces JoLA, a joint learning approach for localization and activation editing, enhancing stability and performance in low-resource scenarios.

Findings

01 JoLA outperforms existing methods on three benchmarks.

02 It effectively identifies relevant model modules for editing.

03 JoLA improves task performance in low-data settings.

Abstract

Parameter-efficient fine-tuning (PEFT) methods, such as LoRA, are commonly used to adapt LLMs. However, the effectiveness of standard PEFT methods is limited in low-resource scenarios with only a few hundred examples. Recent advances in interpretability research have inspired the emergence of activation editing (or steering) techniques, which modify the activations of specific model components. Due to their extremely small parameter counts, these methods show promise for small datasets. However, their performance is highly dependent on identifying the correct modules to edit and often lacks stability across different datasets. In this paper, we propose Joint Localization and Activation Editing (JoLA), a method that jointly learns (1) which heads in the Transformer to edit; (2) whether the intervention should be additive, multiplicative, or both; and (3) the intervention parameters…
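To make the three learned quantities in the abstract concrete, here is a minimal sketch of a gated per-head edit in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the names `HeadEdit`, `edit_head`, `edit_layer`, and `active_heads`, the gate range of [0, 1], the specific form `(1 + gate_mul * scale) * z + gate_add * offset`, and the 0.5 threshold are all hypothetical choices made for this example. In a real setup the gates and vectors would be trained jointly; here they are fixed by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HeadEdit:
    """Hypothetical per-head intervention parameters (illustrative names)."""
    gate_add: float      # assumed in [0, 1]; 0 switches the additive edit off
    gate_mul: float      # assumed in [0, 1]; 0 switches the multiplicative edit off
    offset: List[float]  # additive vector, one entry per head dimension
    scale: List[float]   # multiplicative vector, one entry per head dimension


def edit_head(z: List[float], e: HeadEdit) -> List[float]:
    """Apply (1 + gate_mul * scale) * z + gate_add * offset elementwise.

    With both gates at 0 the head output is returned unchanged, which is
    how a head is left out of the edit in this sketch.
    """
    return [
        (1.0 + e.gate_mul * s) * x + e.gate_add * a
        for x, s, a in zip(z, e.scale, e.offset)
    ]


def edit_layer(heads: List[List[float]], edits: List[HeadEdit]) -> List[List[float]]:
    """Edit every head output of one layer with its own parameters."""
    return [edit_head(z, e) for z, e in zip(heads, edits)]


def active_heads(edits: List[HeadEdit], threshold: float = 0.5) -> List[int]:
    """Indices of heads whose additive or multiplicative gate exceeds the threshold."""
    return [i for i, e in enumerate(edits) if max(e.gate_add, e.gate_mul) > threshold]


# Two heads with two dimensions each; only the second head is edited.
heads = [[1.0, 2.0], [3.0, 4.0]]
edits = [
    HeadEdit(gate_add=0.0, gate_mul=0.0, offset=[9.0, 9.0], scale=[9.0, 9.0]),
    HeadEdit(gate_add=1.0, gate_mul=1.0, offset=[0.5, -0.5], scale=[1.0, 0.0]),
]
edited = edit_layer(heads, edits)
selected = active_heads(edits)
```

In this toy setting the first head passes through untouched because both of its gates are closed, while the second head receives both a multiplicative and an additive edit. Reading off which gates are open corresponds to the "localization" part of the method; the vectors correspond to the intervention parameters.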

Code & Models

Repositories

wenlai-lavine/jola (PyTorch, official)

Videos

Joint Localization and Activation Editing for Low-Resource Fine-Tuning · SlidesLive

Taxonomy

Topics: Modular Robots and Swarm Intelligence · Parallel Computing and Optimization Techniques · Advancements in Photolithography Techniques

Methods: Attention Is All You Need · Absolute Position Encodings · Dense Connections · Linear Layer · Layer Normalization · Byte Pair Encoding · Residual Connection · Label Smoothing · Multi-Head Attention · Position-Wise Feed-Forward Layer