Talk, Snap, Complain: Validation-Aware Multimodal Expert Framework for Fine-Grained Customer Grievances

Rishu Kumar Singh; Navneet Shreya; Sarmistha Das; Apoorva Singh; Sriparna Saha

arXiv:2511.14693 · cs.CL · November 19, 2025

TL;DR

This paper introduces VALOR, a multimodal, validation-aware framework that improves fine-grained customer complaint classification by integrating textual and visual data with expert reasoning, enhancing robustness and contextual understanding.

Contribution

The paper presents VALOR, a novel multimodal, validation-aware learning framework with expert routing and semantic alignment, advancing complaint analysis beyond unimodal approaches.

Findings

1. VALOR outperforms baseline models on a curated complaint dataset.

2. It effectively handles complex scenarios with distributed multimodal information.

3. The framework supports sustainable development goals related to industry and responsible consumption.

Abstract

Existing approaches to complaint analysis largely rely on unimodal, short-form content such as tweets or product reviews. This work advances the field by leveraging multimodal, multi-turn customer support dialogues, where users often share both textual complaints and visual evidence (e.g., screenshots, product photos) to enable fine-grained classification of complaint aspects and severity. We introduce VALOR, a Validation-Aware Learner with Expert Routing, tailored for this multimodal setting. It employs a multi-expert reasoning setup using large-scale generative models with Chain-of-Thought (CoT) prompting for nuanced decision-making. To ensure coherence between modalities, a semantic alignment score is computed and integrated into the final classification through a meta-fusion strategy. In alignment with the United Nations Sustainable Development Goals (UN SDGs), the proposed…
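The abstract mentions a semantic alignment score that feeds into a meta-fusion step, but this page does not give the formulas. The following is a minimal sketch of one way such a step could look, not the authors' implementation. It assumes the alignment score is a rescaled cosine similarity between a text embedding and an image embedding, and that fusion is a convex combination of two expert probability distributions weighted by that score. All function names, label names, and inputs here are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def alignment_score(text_emb, image_emb):
    """Assumed alignment score: cosine similarity rescaled from [-1, 1] to [0, 1]."""
    return 0.5 * (cosine_similarity(text_emb, image_emb) + 1.0)


def meta_fuse(text_probs, multimodal_probs, alignment):
    """Assumed fusion rule: weight the multimodal expert by the alignment score
    and the text-only expert by its complement, then renormalize."""
    fused = {
        label: alignment * multimodal_probs[label] + (1.0 - alignment) * text_probs[label]
        for label in text_probs
    }
    total = sum(fused.values())
    if total == 0.0:
        return fused
    return {label: value / total for label, value in fused.items()}


def predict(fused_probs):
    """Return the label with the highest fused probability."""
    return max(fused_probs, key=fused_probs.get)


if __name__ == "__main__":
    # Hypothetical severity labels and expert outputs, for illustration only.
    text_expert = {"low": 0.7, "high": 0.3}
    multimodal_expert = {"low": 0.2, "high": 0.8}
    score = alignment_score([0.9, 0.1, 0.3], [0.8, 0.2, 0.4])
    print(predict(meta_fuse(text_expert, multimodal_expert, score)))
```

Under this assumed rule, a dialogue whose image and text disagree (low alignment) falls back toward the text-only expert, while a well-aligned pair lets the multimodal expert dominate.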

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Public Relations and Crisis Communication · Explainable Artificial Intelligence (XAI)