Contextual Attention Modulation: Towards Efficient Multi-Task Adaptation in Large Language Models
Dayan Pan, Zhaoyang Fu, Jingyuan Wang, Xiao Han, Yue Zhu, Xiangyu Zhao

TL;DR
This paper introduces Contextual Attention Modulation (CAM) and the HyCAM framework built on it for multi-task adaptation of large language models. The aim is to improve task-specific performance while preserving general knowledge, and the reported result is an average improvement of 3.65% across tasks.
Contribution
We propose CAM, a module that dynamically modulates self-attention representations, and HyCAM, a framework that builds on CAM to enable efficient multi-task adaptation in LLMs.
Findings
Achieved an average performance improvement of 3.65% across tasks.
Outperformed existing multi-task adaptation methods.
Demonstrated effectiveness on question answering, code generation, and logical reasoning.
Abstract
Large Language Models (LLMs) possess remarkable generalization capabilities but struggle with multi-task adaptation, particularly in balancing knowledge retention with task-specific specialization. Conventional fine-tuning methods suffer from catastrophic forgetting and substantial resource consumption, while existing parameter-efficient methods perform suboptimally in complex multi-task scenarios. To address this, we propose Contextual Attention Modulation (CAM), a novel mechanism that dynamically modulates the representations of self-attention modules in LLMs. CAM enhances task-specific features while preserving general knowledge, thereby facilitating more effective and efficient adaptation. For effective multi-task adaptation, CAM is integrated into our Hybrid Contextual Attention Modulation (HyCAM) framework, which combines a shared, full-parameter CAM module with multiple…
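The abstract describes CAM only at a high level: it modulates self-attention representations depending on context. As a rough illustration of that idea, the sketch below scales each feature of an attention output by a factor computed from the input hidden state. Everything in it is an assumption made for illustration and is not taken from the paper: the names `modulate` and `W_mod`, the `1 + tanh(...)` form of the factor, and the zero initialization.

```python
import math

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def modulate(attn_out, hidden, W_mod):
    """Hypothetical context-dependent modulation of an attention output.

    attn_out: output of a self-attention module for one token
    hidden:   the token's input hidden state (the "context" signal)
    W_mod:    trainable modulation weights (assumed, not from the paper)
    """
    # One modulation factor per feature, computed from the context.
    factors = [1.0 + math.tanh(z) for z in matvec(W_mod, hidden)]
    # Element-wise scaling of the attention output.
    return [a * f for a, f in zip(attn_out, factors)]

attn_out = [1.0, 2.0]
hidden = [1.0, 0.0]

# With zero weights every factor is 1, so the output is unchanged.
W_zero = [[0.0, 0.0], [0.0, 0.0]]
unchanged = modulate(attn_out, hidden, W_zero)

# With non-zero weights the first feature is amplified by 1 + tanh(1).
W_eye = [[1.0, 0.0], [0.0, 1.0]]
modulated = modulate(attn_out, hidden, W_eye)
```

In this sketch, zero-initialized weights leave the pretrained attention output untouched, which is one simple way a modulation layer could start from the base model's behavior before learning task-specific adjustments. Whether CAM uses this form is not stated in the text above.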
