'No' Matters: Out-of-Distribution Detection in Multimodality Long Dialogue
Rena Gao, Xuetong Wu, Siwen Luo, Caren Han, Feng Liu

TL;DR
This paper presents DIAEF, a novel framework for detecting out-of-distribution inputs in multimodal long dialogues, improving dialogue systems by effectively identifying mismatched or unseen modality pairs.
Contribution
Introduces DIAEF, a new scoring framework that integrates visual language models for robust OOD detection in multimodal long dialogues, addressing mismatches and unseen labels.
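As a rough illustration of how a score could cover both scenarios, the sketch below combines a dialogue-image alignment term with a label-confidence term. It is not the DIAEF scoring function, which this summary does not specify. The function names, the equal weighting `alpha`, the threshold, and the assumption that embeddings and label logits come from some visual language model are all hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(logits: Sequence[float]) -> list:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def in_distribution_score(
    dialogue_emb: Sequence[float],
    image_emb: Sequence[float],
    label_logits: Sequence[float],
    alpha: float = 0.5,  # hypothetical weighting, not taken from the paper
) -> float:
    """Higher means more in-distribution; the result lies in [0, 1]."""
    # Scenario 1 (mismatched pair): low dialogue-image alignment lowers the score.
    alignment = (cosine_similarity(dialogue_emb, image_emb) + 1.0) / 2.0
    # Scenario 2 (unseen label): low confidence over the known labels lowers the score.
    confidence = max(softmax(label_logits))
    return alpha * alignment + (1.0 - alpha) * confidence


def is_ood(score: float, threshold: float = 0.5) -> bool:
    """Flag an input pair as OOD when its score falls below a hypothetical threshold."""
    return score < threshold


# Toy inputs: an aligned pair with a confident label vs. an opposed pair with flat logits.
matched = in_distribution_score([1.0, 0.0], [1.0, 0.0], [5.0, 0.0, 0.0])
mismatched = in_distribution_score([1.0, 0.0], [-1.0, 0.0], [0.0, 0.0, 0.0])
print(is_ood(matched), is_ood(mismatched))
```

In practice the threshold would be tuned on held-out in-distribution data rather than fixed in advance.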
Findings
Effective detection of mismatched dialogue-image pairs.
Superior performance in identifying unseen labels across benchmarks.
Robustness in long dialogue scenarios.
Abstract
Out-of-distribution (OOD) detection in multimodal contexts is essential for identifying deviations in combined inputs from different modalities, particularly in applications such as open-domain dialogue systems and real-life dialogue interactions. This paper aims to improve the user experience in multi-round long dialogues by efficiently detecting OOD dialogues and images. We introduce a novel scoring framework, the Dialogue Image Aligning and Enhancing Framework (DIAEF), which integrates visual language models with newly proposed scores to detect OOD inputs in two key scenarios: (1) mismatches between the dialogue and the image in an input pair and (2) input pairs with previously unseen labels. Our experimental results, derived from various benchmarks, demonstrate that integrating image and multi-round dialogue OOD detection is more effective with previously unseen labels than using…
Taxonomy
Topics: Speech and dialogue systems
