UniICL: Systematizing Unified Multimodal In-context Learning through a Capability-Oriented Taxonomy
Yicheng Xu, Jiangning Zhang, Zhucun Xue, Teng Hu, Ran Yi, Xiaobin Hu, Yong Liu, Dacheng Tao

TL;DR
This paper introduces a taxonomy and benchmark for understanding and improving multimodal in-context learning, proposing a new module that stabilizes adaptation and outperforms larger models on key tasks.
Contribution
It presents a capability-oriented taxonomy for analyzing multimodal in-context learning and introduces UniICL-760K and UniICL-Bench for systematic evaluation, along with a novel stabilizing module.
Findings
The taxonomy clarifies the functional roles of demonstrations in multimodal tasks.
The proposed module improves stability and performance in few-shot learning.
The proposed approach outperforms larger models on most understanding tasks.
Abstract
In-context Learning enables training-free adaptation via demonstrations but remains highly sensitive to example selection and formatting. In unified multimodal models spanning understanding and generation, this sensitivity is exacerbated by cross-modal interference and varying cognitive demands. Consequently, In-context Learning efficacy is often non-monotonic and highly task-dependent. To diagnose these behaviors, we introduce a six-level capability-oriented taxonomy that categorizes the functional role of demonstrations from basic perception to high-order discernment. Guided by this cognitive framework, we construct UniICL-760K, a large-scale corpus featuring curated 8-shot In-context Learning episodes across 15 subtasks, alongside UniICL-Bench for rigorous, controlled evaluation. As an architectural intervention to stabilize few-shot adaptation, we propose the Context-Adaptive…
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
