EmoCaliber: Advancing Reliable Visual Emotion Comprehension via Confidence Verbalization and Calibration

Daiqing Wu; Dongbao Yang; Can Ma; Yu Zhou

arXiv:2512.15528·cs.CV·December 22, 2025

TL;DR

EmoCaliber enhances visual emotion comprehension by enabling multimodal models to verbalize and calibrate their confidence, addressing subjectivity and improving reliability in emotion prediction from images.

Contribution

The paper introduces a novel three-stage training framework for MLLMs that incorporates confidence verbalization and calibration in visual emotion comprehension tasks.

Findings

01 Outperforms existing methods in emotion prediction accuracy.

02 Provides reliable confidence estimates alongside predictions.

03 Demonstrates improved calibration of confidence scores.
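
To make "calibration of confidence scores" concrete, the following is a minimal sketch of Expected Calibration Error (ECE), a standard way to measure how closely stated confidence matches observed accuracy. This page does not say which calibration metric the paper reports, so ECE is an assumption chosen for illustration; the function name, the equal-width binning, and the sample values are likewise hypothetical and are not taken from the authors' code.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| per bin.

    Illustrative only; not the metric implementation from the paper.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins  # sum of confidences per bin
    hits = [0] * n_bins        # number of correct predictions per bin
    count = [0] * n_bins       # number of predictions per bin

    for c, ok in zip(confidences, correct):
        if not 0.0 <= c <= 1.0:
            raise ValueError("confidence values must lie in [0, 1]")
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hits[idx] += int(ok)
        count[idx] += 1

    ece = 0.0
    for s, h, k in zip(conf_sum, hits, count):
        if k == 0:
            continue  # empty bins contribute nothing
        ece += (k / n) * abs(h / k - s / k)
    return ece


# Hypothetical verbalized confidences (as fractions) and whether each
# predicted emotion label matched the annotation.
sample_conf = [0.9, 0.9, 0.9, 0.9]
sample_hit = [True, False, False, False]
sample_ece = expected_calibration_error(sample_conf, sample_hit)
```

A lower value means the stated confidence tracks accuracy more closely. In the sample, the model claims 90% confidence but is right only a quarter of the time, so the score is large.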

Abstract

Visual Emotion Comprehension (VEC) aims to infer sentiment polarities or emotion categories from affective cues embedded in images. In recent years, Multimodal Large Language Models (MLLMs) have established a popular paradigm in VEC, leveraging their generalizability to unify VEC tasks defined under diverse emotion taxonomies. While this paradigm achieves notable success, it typically formulates VEC as a deterministic task, requiring the model to output a single, definitive emotion label for each image. Such a formulation insufficiently accounts for the inherent subjectivity of emotion perception, overlooking alternative interpretations that may be equally plausible to different viewers. To address this limitation, we propose equipping MLLMs with capabilities to verbalize their confidence in emotion predictions. This additional signal provides users with an estimate of both the…

Code & Models

Models

Hugging Face: wudq/EmoCaliber (model, 7 downloads, 1 like)

Taxonomy

Topics: Multimodal Machine Learning Applications · Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining