Text-Guided Multi-Scale Frequency Representation Adaptation

Weicai Yan; Xinhua Ma; Wang Lin; Tao Jin

arXiv:2605.08181·cs.CV·May 12, 2026

TL;DR

This paper introduces FreqAdapter, a multi-scale frequency-domain adaptation method that improves the performance and efficiency of multimodal models by integrating textual information and optimizing receptive fields across frequency ranges.

Contribution

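It proposes a multi-scale frequency adaptation strategy that addresses two limitations of existing parameter-efficient fine-tuning methods: the information redundancy that comes from operating in the signal space, and the use of fixed prompts or adaptation layers that do not account for the multi-scale characteristics of signals.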

Findings

01 FreqAdapter significantly improves the performance of multimodal models.

02 It converges quickly, within one epoch.

03 It improves efficiency at minimal additional cost.

Abstract

Parameter-efficient fine-tuning methods introduce a small number of trainable parameters, enabling pre-trained models to adapt rapidly to new data distributions. While these methods have shown promising results, they exhibit notable limitations. First, most existing methods operate in the signal space domain, which results in substantial information redundancy. Second, most existing methods utilize fixed prompts or adaptation layers, failing to fully account for the multi-scale characteristics of signals. To address these challenges, we propose the Multi-Scale Frequency Adapter (FreqAdapter), which integrates textual information and performs multi-scale fine-tuning of signals in the frequency domain. Additionally, we introduce a multi-scale adaptation strategy to optimize receptive fields across different frequency ranges, further enhancing the model's representational capacity.…
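
The abstract does not spell out the architecture, so the following is only a rough sketch of what text-guided, band-wise adaptation in the frequency domain could look like. It is not the authors' implementation. The function names (`band_masks`, `freq_adapt`), the parameters `w_gate` and `band_scale`, the sigmoid gating, and the choice of equal-width frequency bands are all assumptions made for illustration.

```python
import numpy as np

def band_masks(n_freq, n_bands):
    """Partition rFFT bin indices into contiguous bands, low to high frequency."""
    edges = np.linspace(0, n_freq, n_bands + 1).astype(int)
    masks = np.zeros((n_bands, n_freq))
    for b in range(n_bands):
        masks[b, edges[b]:edges[b + 1]] = 1.0
    return masks

def freq_adapt(x, text_emb, w_gate, band_scale):
    """Hypothetical text-guided frequency adapter (illustrative only).

    x          : (seq_len, dim) frozen-backbone token features
    text_emb   : (text_dim,) pooled text embedding
    w_gate     : (text_dim, n_bands) assumed projection from text to band gates
    band_scale : (n_bands, dim) assumed learnable per-band, per-channel scale
    """
    seq_len = x.shape[0]
    spec = np.fft.rfft(x, axis=0)                 # (n_freq, dim), complex
    masks = band_masks(spec.shape[0], band_scale.shape[0])
    # Text decides how strongly each frequency band is adapted (assumed sigmoid gate).
    gate = 1.0 / (1.0 + np.exp(-(text_emb @ w_gate)))        # (n_bands,)
    # Combine band membership, text gate, and learnable scale per bin and channel.
    mod = np.einsum("bf,b,bd->fd", masks, gate, band_scale)  # (n_freq, dim)
    delta = np.fft.irfft(spec * mod, n=seq_len, axis=0)      # back to token space
    return x + delta                              # residual update

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
text_emb = rng.normal(size=3)
out = freq_adapt(x, text_emb, rng.normal(size=(3, 2)), np.zeros((2, 4)))
```

With `band_scale` initialized to zeros, as in the usage lines above, the adapter starts as an identity mapping, a common choice for adapter-style modules so that training begins from the pre-trained model's behavior. Whether FreqAdapter uses this initialization is not stated in the abstract.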

Code & Models

Repositories

Kelvin-ywc/FreqAdapter (GitHub)
