Mechanisms are Transferable: Data-Efficient Low-Resource Adaptation via Circuit-Targeted Supervised Fine-Tuning

Khumaisa Nur'aini; Ayu Purwarianti; Alham Fikri Aji; Derry Wijaya

arXiv:2601.08146·cs.CL·January 21, 2026

TL;DR

This paper introduces CT-SFT, a method for low-resource language adaptation of large language models that selectively fine-tunes task-relevant attention heads, improving cross-lingual accuracy and reducing catastrophic forgetting.

Contribution

The paper proposes Circuit-Targeted Supervised Fine-Tuning (CT-SFT), a novel head-level parameter updating method that enhances low-resource language adaptation while preserving source language knowledge.

Findings

01. CT-SFT improves cross-lingual accuracy over continued full fine-tuning on NusaX-Senti and XNLI.

02. Updating only the selected heads reduces catastrophic forgetting.

03. A trade-off appears between editing circuit heads and preserving low-relevance heads; harder transfers favor editing circuit heads.

Abstract

Adapting LLMs to low-resource languages is difficult: labeled data is scarce, full-model fine-tuning is unstable, and continued cross-lingual tuning can cause catastrophic forgetting. We propose Circuit-Targeted Supervised Fine-Tuning (CT-SFT): a counterfactual-free adaptation of CD-T (Contextual Decomposition Transformer) that uses a label-balanced mean baseline and task-directional relevance scoring to identify a sparse set of task-relevant attention heads in a proxy-language checkpoint, then adapts to a target language by updating only those heads (plus LayerNorm) via head-level gradient masking. Across NusaX-Senti and XNLI, CT-SFT improves cross-lingual accuracy over continued full fine-tuning while updating only a small subset of model parameters. We find an editing-preserving trade-off: harder transfers favor editing circuit heads, while easier transfers often favor…
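The head-level gradient masking mentioned in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names (`select_heads`, `mask_head_gradients`, `sgd_step`), the top-k selection rule, the row-grouped weight layout, and the toy relevance scores are all assumptions made for illustration. The idea shown is only that gradients belonging to non-selected heads are zeroed before the optimizer step, so those heads keep their proxy-language weights.

```python
from typing import List, Set

Matrix = List[List[float]]


def select_heads(relevance: List[float], k: int) -> Set[int]:
    """Return the indices of the k heads with the highest relevance score.

    Assumption: top-k selection stands in for whatever rule the paper uses.
    """
    ranked = sorted(range(len(relevance)), key=lambda h: relevance[h], reverse=True)
    return set(ranked[:k])


def mask_head_gradients(grad: Matrix, head_dim: int, keep: Set[int]) -> Matrix:
    """Zero the gradient rows of every head that is not in `keep`.

    Assumption: rows are grouped by head, so row r belongs to head r // head_dim.
    """
    return [
        list(row) if (r // head_dim) in keep else [0.0] * len(row)
        for r, row in enumerate(grad)
    ]


def sgd_step(weight: Matrix, grad: Matrix, lr: float) -> Matrix:
    """Plain SGD update; rows with a zero gradient are left unchanged."""
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weight, grad)]


# Toy projection with 3 heads, head_dim = 2, d_model = 2 (hypothetical values).
relevance = [0.1, 0.9, 0.3]          # made-up per-head relevance scores
keep = select_heads(relevance, k=1)  # only head 1 is selected

weight = [[1.0, 1.0] for _ in range(6)]
grad = [[0.5, 0.5] for _ in range(6)]

masked = mask_head_gradients(grad, head_dim=2, keep=keep)
updated = sgd_step(weight, masked, lr=0.5)

for r, row in enumerate(updated):
    print(f"head {r // 2}, row {r}: {row}")
```

In this toy run only the two rows of head 1 move; all other rows keep their initial values. In a real model the same mask would be applied to each attention projection of each layer, while LayerNorm parameters would be left unmasked, as the abstract states that they are updated alongside the selected heads.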

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification