AMPS: Adaptive Modality Preference Steering via Functional Entropy
Zihan Huang, Xintong Li, Rohan Surana, Tong Yu, Rui Wang, Julian McAuley, Jingbo Shang, Junda Wu

TL;DR
This paper introduces AMPS, a method for dynamically adjusting modality preference in multimodal large language models using an instance-aware metric and learnable scaling, improving control and reducing errors.
Contribution
It presents a novel instance-aware steering approach that adaptively modulates modality preference based on sample-specific information, outperforming uniform steering methods.
Findings
Outperforms traditional uniform steering in controlling modality preference.
Reduces generation error rates while maintaining effective modality adjustment.
Demonstrates improved sample-specific steering effectiveness.
Abstract
Multimodal Large Language Models (MLLMs) often exhibit significant modality preference, which is a tendency to favor one modality over another. Depending on the input, they may over-rely on linguistic priors relative to visual evidence, or conversely over-attend to visually salient but facts in textual contexts. Prior work has applied a uniform steering intensity to adjust the modality preference of MLLMs. However, strong steering can impair standard inference and increase error rates, whereas weak steering is often ineffective. In addition, because steering sensitivity varies substantially across multimodal instances, a single global strength is difficult to calibrate. To address this limitation with minimal disruption to inference, we introduce an instance-aware diagnostic metric that quantifies each modality's information contribution and reveals sample-specific susceptibility to…
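The contrast between uniform and instance-aware steering described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `scaling_factor` and `steer`, the clipping bounds, and the form gamma = 1 + beta * (ratio - anchor) are assumptions, loosely based on a reviewer's description of the scaling factor as a linear deviation from an anchor ratio modulated by $\beta$. The `ratio` argument stands in for whatever per-sample diagnostic value the method computes.

```python
from typing import List


def scaling_factor(ratio: float, anchor: float, beta: float,
                   lo: float = 0.0, hi: float = 2.0) -> float:
    """Hypothetical per-sample scale: linear deviation from an anchor ratio.

    Assumed form (not taken from the paper): gamma = 1 + beta * (ratio - anchor),
    clipped to [lo, hi] so a single sample cannot be steered without bound.
    """
    gamma = 1.0 + beta * (ratio - anchor)
    return max(lo, min(hi, gamma))


def steer(hidden: List[float], direction: List[float],
          alpha: float, gamma: float = 1.0) -> List[float]:
    """Add a steering direction to a hidden state.

    gamma = 1.0 reproduces uniform steering with global strength alpha;
    a per-sample gamma rescales that strength for each instance.
    """
    if len(hidden) != len(direction):
        raise ValueError("hidden and direction must have the same length")
    return [h + gamma * alpha * d for h, d in zip(hidden, direction)]


# Uniform steering: every sample receives the same strength alpha.
uniform = steer([1.0, 2.0], [1.0, 0.0], alpha=0.5)

# Instance-aware steering: a sample whose diagnostic ratio exceeds the
# anchor is steered harder; one at the anchor is steered as in the uniform case.
gamma = scaling_factor(ratio=0.9, anchor=0.5, beta=2.0)
adaptive = steer([1.0, 2.0], [1.0, 0.0], alpha=0.5, gamma=gamma)
```

Under these assumptions, a sample sitting exactly at the anchor gets gamma = 1 and is treated as in uniform steering, which is one way to keep disruption small for inputs that need no extra correction.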
Peer Reviews
Decision: Submitted to ICLR 2026
1. The paper introduces the Modality Contribution Score (MCS), grounded in functional entropy and Fisher information, to provide a rigorous quantification of modality contribution at the sample level. 2. Instead of the uniform steering used by traditional methods, the authors propose sample-adaptive steering via a scaling coefficient, justified by the diagnostic metric. The inclusion of the learnable module further enables context-sensitive adjustment. 3. The paper provides a comprehensive evalua
1. The context-aware scaling factor $\gamma$ (Equation 9, Page 6) is constructed as a linear deviation from an anchor ratio, modulated by $\beta$. This construction seems somewhat heuristic. A more detailed justification for why this specific formula is the right way to quantify "severity of preference" would strengthen the method. 2. The MCS measurement requires multiple forward passes with KV-cache perturbations for a single input. The computational cost of this diagnostic process is not discussed. 3. It’s b
1. The paper is well-motivated: the proposed Modality Contribution Score (MCS), based on functional entropy and Fisher information, is rigorous. 2. Extensive empirical results: the paper provides comprehensive empirical analysis, including comparisons with prompt-based, static steering, and prior adaptive approaches, across multiple model families (LLaVA, Qwen-VL) and sizes. In Table 1 and Table 2, AMPS shows consistently superior performance for controlling preference whil
1. Experiments on more benchmarks are needed: the experiments use only the $MC^{2}$ dataset and lack evaluation on broader, more real-world multimodal tasks, such as MME, MM-Vet, LLaVA-Bench, and MMStar. 2. To avoid a bias toward one modality, the simplest approach is to remove that modality's tokens or replace them with pad tokens. Have you tried this strategy? 3. Different models and evaluation benchmarks are mixed across Tables 1 and 3, creating confusion and undermining the interpretabilit
1. This research addresses the valuable and practical task of mitigating modality preference bias in multimodal large language models (MLLMs), which directly impacts real-world performance and application versatility. 2. The proposed method introduces a novel Modality Contribution Score (MCS) mechanism for adaptive steering, effectively resolving limitations of uniform steering strength through sample-specific sensitivity analysis. The functional entropy is interesting to measure the sensitivit
1. While the use of the Modality Contribution Score (MCS) is innovative and interesting, its intuition and theoretical justification could be developed further. The lack of background (especially for Eqs. 3-5) may confuse a broader readership. It would also help to provide a more detailed theoretical analysis or evidence supporting MCS. 2. The paper compares AMPS with static steering methods but lacks a comparison to recent approaches in modality preference steering. The baselines
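For readers unfamiliar with the term raised in the reviews, functional entropy has a standard definition for a non-negative function f under a probability measure: Ent(f) = E[f log f] - E[f] log E[f]. The sketch below estimates that quantity from samples. It shows only the textbook definition; how the paper builds MCS from it (Eqs. 3-5) is not reproduced here, and the function name `functional_entropy` is hypothetical.

```python
import math
from typing import Sequence


def functional_entropy(values: Sequence[float]) -> float:
    """Empirical functional entropy of non-negative samples of f.

    Ent(f) = E[f log f] - E[f] log E[f], using the convention 0 * log 0 = 0.
    By Jensen's inequality the exact quantity is non-negative, and it is
    zero when f is constant.
    """
    if len(values) == 0:
        raise ValueError("need at least one sample")
    if any(v < 0 for v in values):
        raise ValueError("functional entropy is defined for non-negative f")
    n = len(values)
    mean = sum(values) / n
    if mean == 0.0:
        return 0.0  # f is identically zero on the samples
    mean_f_log_f = sum(v * math.log(v) for v in values if v > 0) / n
    return mean_f_log_f - mean * math.log(mean)


# A constant function carries no functional entropy; a function that
# varies across samples carries a positive amount.
flat = functional_entropy([2.0] * 10)
varying = functional_entropy([1.0, 3.0])
```

Intuitively, the quantity grows as f varies more across the sampled inputs, which is why it can serve as a sensitivity measure; the log-Sobolev inequality is the standard link between functional entropy and gradient-based (Fisher-type) quantities.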
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
