Emotion and Intention Guided Multi-Modal Learning for Sticker Response Selection

Yuxuan Hu; Jian Chen; Yuhao Wang; Zixuan Li; Jing Xiong; Pengyue Jia; Wei Wang; Chengming Li; Xiangyu Zhao

arXiv:2511.17587 · cs.LG · November 25, 2025

TL;DR

This paper introduces a novel multi-modal learning framework that jointly models emotion and intention to improve sticker response selection in online dialogue, addressing limitations of previous isolated modeling approaches.

Contribution

The paper proposes EIGML, which the authors describe as the first framework to jointly model emotion and intention for sticker response selection. It combines dual-level contrastive alignment with a progressive fusion module.

Findings

01 Outperforms state-of-the-art methods on two public datasets

02 Achieves higher accuracy in sticker response selection

03 Demonstrates effective integration of emotional and intentional cues

Abstract

Stickers are widely used in online communication to convey emotions and implicit intentions. The Sticker Response Selection (SRS) task aims to select the most contextually appropriate sticker for a given dialogue. However, existing methods typically rely on semantic matching and model emotional and intentional cues separately, which can lead to mismatches when emotions and intentions are misaligned. To address this issue, we propose Emotion and Intention Guided Multi-Modal Learning (EIGML). This framework is the first to jointly model emotion and intention, reducing the bias caused by isolated modeling and significantly improving selection accuracy. Specifically, we introduce a Dual-Level Contrastive Framework that performs both intra-modality and inter-modality alignment, ensuring consistent representation of emotional and intentional features within and across modalities. In…
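To make the idea of intra-modality and inter-modality alignment concrete, here is a minimal NumPy sketch of one way a dual-level contrastive objective could be written. It is not the authors' implementation: the choice of an InfoNCE-style loss, the pairing scheme, the temperature value, and the names `info_nce` and `dual_level_loss` are all assumptions made for illustration, since the truncated abstract does not give the actual formulation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def info_nce(anchors, positives, temperature=0.07):
    """InfoNCE loss: row i of `anchors` should match row i of `positives`.

    All other rows in the batch act as negatives. The temperature is an
    assumed default, not a value taken from the paper.
    """
    a = l2_normalize(anchors)
    p = l2_normalize(positives)
    logits = a @ p.T / temperature
    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The matching pair for each anchor sits on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


def dual_level_loss(text_emo, text_int, img_emo, img_int, temperature=0.07):
    """Hypothetical dual-level objective (assumed pairing, for illustration).

    Inter-modality: align dialogue and sticker features of the same kind.
    Intra-modality: align emotion and intention features within a modality.
    """
    inter = 0.5 * (
        info_nce(text_emo, img_emo, temperature)
        + info_nce(img_emo, text_emo, temperature)
        + info_nce(text_int, img_int, temperature)
        + info_nce(img_int, text_int, temperature)
    )
    intra = info_nce(text_emo, text_int, temperature) + info_nce(
        img_emo, img_int, temperature
    )
    return inter + intra


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy batch: 8 dialogue/sticker pairs with 16-dimensional features.
    feats = [rng.normal(size=(8, 16)) for _ in range(4)]
    print(dual_level_loss(*feats))
```

Each input is assumed to be a batch of feature vectors in which row i of every array belongs to the same dialogue and sticker pair. In a real system these features would come from trained encoders and the loss would be computed in an autodiff framework so that gradients can flow back to them.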

Taxonomy

Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Topic Modeling