Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding
Yueqian Wang, Xiaojun Meng, Yuxuan Wang, Jianxin Liang, Qun Liu, Dongyan Zhao

TL;DR
This paper introduces Friends-MMC, a multi-modal multi-party conversation dataset with over 24,000 utterances, and studies fundamental tasks like speaker identification and response prediction, emphasizing the importance of character-centered understanding.
Contribution
The paper presents a new dataset for MMC with detailed annotations and explores baseline methods for speaker identification and response prediction, highlighting the significance of speaker information.
Findings
Existing methods are inefficient for speaker identification.
A simple baseline with an optimization solver improves performance.
Fine-tuning models benefits from speaker information.
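To make the second finding concrete, here is a minimal sketch of how per-utterance visual scores and pairwise textual cues could be combined by a solver. It is an illustration under stated assumptions, not the paper's implementation: the function name `identify_speakers`, the objective, the `weight` parameter, and the toy scores are all hypothetical, and an exhaustive search stands in for whatever optimization solver the authors use.

```python
from itertools import product


def identify_speakers(visual_scores, same_speaker_scores, weight=1.0):
    """Pick one speaker per utterance by exhaustive search (hypothetical objective).

    visual_scores: one dict per utterance, mapping candidate name -> score in [0, 1]
        (assumed to come from a visual model, e.g. who appears to be talking).
    same_speaker_scores: dict mapping (i, j) -> assumed probability that
        utterances i and j share a speaker (e.g. from a text model).
    """
    candidates = sorted({name for scores in visual_scores for name in scores})
    best_assignment, best_value = None, float("-inf")
    # Enumerate every joint assignment; only feasible for short dialogues.
    for assignment in product(candidates, repeat=len(visual_scores)):
        # Reward speakers that the visual scores favor.
        value = sum(s.get(name, 0.0) for s, name in zip(visual_scores, assignment))
        # Reward assignments that agree with the pairwise textual cues.
        for (i, j), p_same in same_speaker_scores.items():
            agree = assignment[i] == assignment[j]
            value += weight * (p_same if agree else 1.0 - p_same)
        if value > best_value:
            best_assignment, best_value = assignment, value
    return list(best_assignment)


# Toy example: the visual scores are ambiguous for utterance 1,
# so the textual cues decide it.
visual = [
    {"Ross": 0.9, "Rachel": 0.1},
    {"Ross": 0.5, "Rachel": 0.5},
    {"Ross": 0.2, "Rachel": 0.8},
]
textual = {(0, 1): 0.1, (1, 2): 0.9}
predicted = identify_speakers(visual, textual)
print(predicted)
```

In this toy case the second utterance is a visual tie, and the pairwise cues (unlikely to share a speaker with utterance 0, likely to share one with utterance 2) break it. Exhaustive search grows exponentially with dialogue length, so a real system would hand the same objective to a proper solver.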
Abstract
Multi-modal multi-party conversation (MMC) is a less-studied yet important research topic because it fits real-world scenarios well and thus potentially has wider applications. Compared with traditional multi-modal conversations, MMC requires stronger character-centered understanding abilities, as many interlocutors appear in both the visual and textual context. To facilitate the study of this problem, we present Friends-MMC, an MMC dataset that contains 24,000+ unique utterances paired with video context. To explore character-centered understanding of the dialogue, we also annotate the speaker of each utterance and the names and bounding boxes of faces that appear in the video. Based on the Friends-MMC dataset, we further study two fundamental MMC tasks: conversation speaker identification and conversation response prediction, both of…
Taxonomy
Topics: Speech and dialogue systems · Language, Metaphor, and Cognition · Interpreting and Communication in Healthcare
Methods: Softmax · Attention Is All You Need
