An Empirical Study on Configuring In-Context Learning Demonstrations for Unleashing MLLMs' Sentimental Perception Capability

Daiqing Wu; Dongbao Yang; Sicheng Zhao; Can Ma; Yu Zhou

arXiv:2505.16193·cs.CL·May 23, 2025

Open Access

TL;DR

This paper investigates how to configure in-context learning demonstrations to improve multimodal large language models' sentiment analysis, reporting a 15.9% average accuracy improvement over zero-shot methods.

Contribution

It systematically analyzes and optimizes demonstration retrieval, presentation, and distribution factors for in-context learning in multimodal sentiment analysis, revealing and counteracting predictive biases.
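
The three factors named above (retrieval, presentation, and distribution) can be pictured with a minimal sketch. This is not the authors' method: the function names, the toy embedding vectors, the prompt wording, the per-label quota, and the choice to place the most similar demonstration last are all assumptions made for illustration, and image inputs are omitted to keep the example text-only.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_balanced(query_vec, pool, per_label):
    """Retrieval + distribution (hypothetical): keep the `per_label` most
    similar demonstrations for every label, so no sentiment dominates."""
    by_label = defaultdict(list)
    for demo in pool:
        by_label[demo["label"]].append(demo)

    picked = []
    for label in sorted(by_label):
        ranked = sorted(
            by_label[label],
            key=lambda d: cosine(query_vec, d["vec"]),
            reverse=True,
        )
        picked.extend(ranked[:per_label])

    # Assumption: order by ascending similarity, so the closest
    # demonstration sits right before the query.
    picked.sort(key=lambda d: cosine(query_vec, d["vec"]))
    return picked


def build_prompt(demos, query_text):
    """Presentation (hypothetical template): demonstrations as
    text/label pairs, followed by the unlabeled query."""
    lines = ["Classify the sentiment as positive or negative.", ""]
    for d in demos:
        lines.append(f"Text: {d['text']}")
        lines.append(f"Sentiment: {d['label']}")
        lines.append("")
    lines.append(f"Text: {query_text}")
    lines.append("Sentiment:")
    return "\n".join(lines)


# Toy demonstration pool; the 2-D vectors stand in for real embeddings.
DEMO_POOL = [
    {"text": "Sunny beach day", "label": "positive", "vec": [1.0, 0.0]},
    {"text": "Great concert", "label": "positive", "vec": [0.6, 0.8]},
    {"text": "Flight cancelled", "label": "negative", "vec": [0.0, 1.0]},
    {"text": "Lost my keys", "label": "negative", "vec": [0.8, 0.6]},
]

demos = retrieve_balanced([1.0, 0.0], DEMO_POOL, per_label=1)
prompt = build_prompt(demos, "Picnic by the sea")
```

Fixing a quota per label is only one simple way to control the label distribution of the demonstrations; the paper's own configuration strategies may differ.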

Findings

01 15.9% average accuracy improvement over zero-shot methods

02 Effective strategies for demonstration configuration identified

03 Discovered and mitigated sentiment predictive bias in MLLMs

Abstract

The advancements in Multimodal Large Language Models (MLLMs) have enabled various multimodal tasks to be addressed under a zero-shot paradigm. This paradigm sidesteps the cost of model fine-tuning, emerging as a dominant trend in practical application. Nevertheless, Multimodal Sentiment Analysis (MSA), a pivotal challenge in the quest for general artificial intelligence, fails to accommodate this convenience. The zero-shot paradigm exhibits undesirable performance on MSA, casting doubt on whether MLLMs can perceive sentiments as competently as supervised models. By extending the zero-shot paradigm to In-Context Learning (ICL) and conducting an in-depth study on configuring demonstrations, we validate that MLLMs indeed possess such capability. Specifically, three key factors that cover demonstrations' retrieval, presentation, and distribution are comprehensively investigated and optimized.…

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Multimodal Machine Learning Applications · Topic Modeling