M3HG: Multimodal, Multi-scale, and Multi-type Node Heterogeneous Graph for Emotion Cause Triplet Extraction in Conversations

Qiao Liang; Ying Shen; Tiantian Chen; Lin Zhang

arXiv:2508.18740·cs.CL·August 27, 2025

TL;DR

This paper introduces M3HG, a multimodal heterogeneous graph model for emotion cause triplet extraction in conversations, together with MECAD, a new multi-scenario dataset. The model targets two gaps in earlier methods: they do not explicitly model emotional and causal contexts, and they neglect the fusion of semantic information at different levels.

Contribution

The paper presents M3HG, a model that explicitly captures emotional and causal contexts and fuses multimodal information at multiple levels, along with MECAD, a multi-scenario dataset for Emotion Cause Triplet Extraction in Multimodal Conversations (MECTEC).

Findings

01  M3HG outperforms existing methods in emotion cause triplet extraction.

02  MECAD dataset covers 989 conversations from 56 TV series.

03  Explicit modeling of emotional and causal contexts improves performance.

Abstract

Emotion Cause Triplet Extraction in Multimodal Conversations (MECTEC) has recently gained significant attention in social media analysis, aiming to extract emotion utterances, cause utterances, and emotion categories simultaneously. However, the scarcity of related datasets, with only one published dataset featuring highly uniform dialogue scenarios, hinders model development in this field. To address this, we introduce MECAD, the first multimodal, multi-scenario MECTEC dataset, comprising 989 conversations from 56 TV series spanning a wide range of dialogue contexts. In addition, existing MECTEC methods fail to explicitly model emotional and causal contexts and neglect the fusion of semantic information at different levels, leading to performance degradation. In this paper, we propose M3HG, a novel model that explicitly captures emotional and causal contexts and effectively fuses…
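
To make the task output concrete, the following minimal Python sketch shows one possible way to represent the triplets the abstract describes (emotion utterance, cause utterance, emotion category). It is an illustration only, not the authors' code or the MECAD file format: the class names, field names, helper function, sample conversation, and emotion labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Utterance:
    """One turn of a conversation (hypothetical schema, text modality only)."""
    index: int      # 1-based position in the conversation
    speaker: str
    text: str


@dataclass(frozen=True)
class EmotionCauseTriplet:
    """One extracted triplet: which utterance carries an emotion,
    which utterance causes it, and the emotion category."""
    emotion_utterance: int   # index of the utterance expressing the emotion
    cause_utterance: int     # index of the utterance that triggers it
    emotion: str             # emotion category (labels here are made up)


def causes_of(triplets: Iterable[EmotionCauseTriplet], emotion_utterance: int) -> List[int]:
    """Return the sorted cause-utterance indices for one emotion utterance."""
    return sorted(
        t.cause_utterance for t in triplets if t.emotion_utterance == emotion_utterance
    )


# Invented toy conversation, not taken from any dataset.
conversation = [
    Utterance(1, "A", "I got the job!"),
    Utterance(2, "B", "That is wonderful news!"),
    Utterance(3, "A", "I still can't believe it."),
]

# A plausible set of gold triplets for the toy conversation.
triplets = {
    EmotionCauseTriplet(emotion_utterance=1, cause_utterance=1, emotion="joy"),
    EmotionCauseTriplet(emotion_utterance=2, cause_utterance=1, emotion="joy"),
    EmotionCauseTriplet(emotion_utterance=3, cause_utterance=1, emotion="surprise"),
}
```

In this representation a system's prediction for a conversation is simply a set of such triplets, so an utterance can appear as the cause of several emotion utterances, as utterance 1 does here.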
