Contextual Attention Modulation: Towards Efficient Multi-Task Adaptation in Large Language Models

Dayan Pan; Zhaoyang Fu; Jingyuan Wang; Xiao Han; Yue Zhu; Xiangyu Zhao

arXiv:2510.17705 · cs.AI · October 21, 2025

TL;DR

This paper introduces CAM and HyCAM, mechanisms for multi-task adaptation in large language models that improve task-specific performance while preserving general knowledge. Both are evaluated on question answering, code generation, and logical reasoning.

Contribution

We propose CAM, a mechanism that dynamically modulates self-attention representations, and HyCAM, a framework built on CAM, to enable efficient multi-task adaptation in LLMs.

Findings

01

Achieved an average performance improvement of 3.65% across tasks.

02

Outperformed existing multi-task adaptation methods.

03

Demonstrated effectiveness on question answering, code generation, and logical reasoning.

Abstract

Large Language Models (LLMs) possess remarkable generalization capabilities but struggle with multi-task adaptation, particularly in balancing knowledge retention with task-specific specialization. Conventional fine-tuning methods suffer from catastrophic forgetting and substantial resource consumption, while existing parameter-efficient methods perform suboptimally in complex multi-task scenarios. To address this, we propose Contextual Attention Modulation (CAM), a novel mechanism that dynamically modulates the representations of self-attention modules in LLMs. CAM enhances task-specific features while preserving general knowledge, thereby facilitating more effective and efficient adaptation. For effective multi-task adaptation, CAM is integrated into our Hybrid Contextual Attention Modulation (HyCAM) framework, which combines a shared, full-parameter CAM module with multiple…
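To make the idea of "modulating the representations of self-attention modules" concrete, the following is a minimal sketch and not the paper's implementation. It assumes one plausible form of modulation: the output of a frozen self-attention layer is rescaled element-wise by a gate computed from the input context, `attn * (1 + tanh(h @ w_mod))`. The names `modulated_attention` and `w_mod`, the `tanh` gate, and the single-head setup are all illustrative assumptions; the abstract does not specify the exact formulation.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(h, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (stands in for frozen backbone weights)."""
    q, k, v = matmul(h, w_q), matmul(h, w_k), matmul(h, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v)


def modulated_attention(h, w_q, w_k, w_v, w_mod):
    """Hypothetical modulation (an assumption, not the paper's formula):
    rescale each attention output element by 1 + tanh(h @ w_mod)."""
    attn = self_attention(h, w_q, w_k, w_v)
    gate = matmul(h, w_mod)  # context-dependent gate, same shape as attn
    return [[a * (1.0 + math.tanh(g)) for a, g in zip(a_row, g_row)]
            for a_row, g_row in zip(attn, gate)]


def random_matrix(rows, cols, rng):
    """Matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
seq_len, d = 3, 4
h = random_matrix(seq_len, d, rng)  # toy hidden states: 3 tokens, width 4
w_q, w_k, w_v = (random_matrix(d, d, rng) for _ in range(3))

# A zero-initialized modulation matrix leaves the attention output unchanged.
zero_mod = [[0.0] * d for _ in range(d)]
base = self_attention(h, w_q, w_k, w_v)
same = modulated_attention(h, w_q, w_k, w_v, zero_mod)

# A non-zero modulation matrix rescales features depending on the input context.
changed = modulated_attention(h, w_q, w_k, w_v, random_matrix(d, d, rng))
```

In this sketch, zero-initializing `w_mod` makes the modulated layer start out identical to the original attention layer, which is one simple way to begin adaptation from the pretrained behavior. Only `w_mod` would be trained here; whether the paper uses this initialization or gate is not stated in the abstract.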
