Attribute-Grounded Selective Reasoning for Artwork Emotion Understanding with Multimodal Large Language Models

Cheng Zhang, Yuer Liu, Zhiyu Zhou, Hongxia Xie, and Wen-Huang Cheng

arXiv:2605.15755 · cs.CV · May 18, 2026

TL;DR

This paper introduces Attribute-Grounded Selective Reasoning (AGSR) for artwork emotion understanding. It pairs a human-annotated salience extension of the EmoArt dataset with a supervised multi-agent framework to improve the interpretability and accuracy of emotion analysis.

Contribution

The paper proposes an AGSR framework built on formal-attribute bottleneck-guided reasoning and extends EmoArt with salience annotations to supervise attribute selection.

Findings

01. FAB-G improves emotion, arousal, and valence prediction accuracy.

02. FAB-G produces more compact and human-aligned explanations.

03. Attribute-grounded salience is shown to transfer across datasets.

Abstract

Multimodal large language models (MLLMs) can produce fluent artwork emotion explanations, but they often suffer from attribute flooding: they enumerate many visible formal attributes without identifying which cues actually support the affective judgment. We therefore formulate artwork emotion understanding as Attribute-Grounded Selective Reasoning (AGSR), where predefined formal attributes serve as evidence units and only emotionally operative attributes should enter the final interpretation. To make this problem measurable, we extend EmoArt, originally introduced at ACM MM 2025 as a 132,664-artwork resource with content, formal-attribute, valence-arousal, and emotion annotations, by adding a 1,400-artwork human salience extension annotated by 15 art-trained annotators. This extension provides instance-level supervision for distinguishing attributes that are merely present from those…
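The contrast between attribute flooding and selective reasoning can be illustrated with a small sketch. This is not the authors' method: the attribute names, the salience scores, the 0.5 threshold, and the helpers `select_operative` and `selection_f1` are all hypothetical, chosen only to show the difference between listing every present attribute and keeping the ones marked as emotionally operative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str        # hypothetical formal attribute label
    present: bool    # whether the attribute is visible in the artwork
    salience: float  # hypothetical salience score in [0, 1]


def select_operative(attributes, threshold=0.5):
    """Keep attributes that are present and at least `threshold` salient."""
    return [a.name for a in attributes if a.present and a.salience >= threshold]


def selection_f1(predicted, human_salient):
    """F1 overlap between a predicted attribute set and a human-marked set."""
    predicted, human_salient = set(predicted), set(human_salient)
    overlap = len(predicted & human_salient)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(human_salient)
    return 2 * precision * recall / (precision + recall)


# Illustrative values only; none of these come from the paper or dataset.
attributes = [
    Attribute("color", True, 0.9),
    Attribute("brushstroke", True, 0.7),
    Attribute("composition", True, 0.2),
    Attribute("line", True, 0.1),
    Attribute("light", False, 0.0),
]
human_salient = {"color", "brushstroke"}

# Flooding: every present attribute enters the explanation.
flooded = [a.name for a in attributes if a.present]
# Selective reasoning: only salient attributes enter the explanation.
selected = select_operative(attributes)

print("flooded :", flooded, round(selection_f1(flooded, human_salient), 3))
print("selected:", selected, round(selection_f1(selected, human_salient), 3))
```

In this toy setup the flooded list recovers every human-marked attribute but dilutes precision, while the thresholded list matches the human-marked set exactly. A real system would need learned salience scores rather than the fixed values assumed here.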

Code & Models

Repositories

https://zhiliangzhang.github.io/EmoArt-130k