Multilingual Extraction and Recognition of Implicit Discourse Relations in Speech and Text
Ahmed Ruby, Christian Hardmeier, Sara Stymne

TL;DR
This paper presents a multilingual, multimodal approach for classifying implicit discourse relations in speech and text, leveraging textual and acoustic data across English, French, and Spanish to improve performance, especially in low-resource languages.
Contribution
It introduces a novel multilingual, multimodal dataset and a joint text-audio classification model for implicit discourse relations across languages.
Findings
Text-based models outperform audio-only models.
Integrating text and audio can improve classification performance.
Cross-lingual transfer significantly benefits low-resource languages.
Abstract
Implicit discourse relation classification is a challenging task, as it requires inferring meaning from context. While contextual cues can be distributed across modalities and vary across languages, they are not always captured by text alone. To address this, we introduce an automatic method for distantly related and unrelated language pairs to construct a multilingual and multimodal dataset for implicit discourse relations in English, French, and Spanish. For classification, we propose a multimodal approach that integrates textual and acoustic information through Qwen2-Audio, allowing joint modeling of text and audio for implicit discourse relation classification across languages. We find that while text-based models outperform audio-based models, integrating both modalities can enhance performance, and cross-lingual transfer can provide substantial improvements for low-resource…
Taxonomy
Topics: Speech Recognition and Synthesis · Emotion and Mood Recognition · Speech and dialogue systems
