MELD-ST: An Emotion-aware Speech Translation Dataset
Sirou Chen, Sakiko Yahata, Shuichiro Shimizu, Zhengdong Yang, Yihang Li, Chenhui Chu, Sadao Kurohashi

TL;DR
This paper introduces the MELD-ST dataset for emotion-aware speech translation, emphasizing the importance of emotion in translation quality and providing a new resource for developing more emotionally intelligent translation systems.
Contribution
The paper presents MELD-ST, a new emotion-annotated speech translation dataset covering the English-to-Japanese and English-to-German language pairs, with about 10,000 utterances per pair, enabling research on emotion-aware translation.
Findings
Fine-tuning with emotion labels can improve translation performance in some settings.
Baseline experiments with the SeamlessM4T model demonstrate the dataset's potential for emotion-aware translation.
The results highlight the need for further research in emotion-aware speech translation.
Abstract
Emotion plays a crucial role in human conversation. This paper underscores the significance of considering emotion in speech translation. We present the MELD-ST dataset for the emotion-aware speech translation task, comprising English-to-Japanese and English-to-German language pairs. Each language pair includes about 10,000 utterances annotated with emotion labels from the MELD dataset. Baseline experiments using the SeamlessM4T model on the dataset indicate that fine-tuning with emotion labels can enhance translation performance in some settings, highlighting the need for further research in emotion-aware speech translation systems.
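As a rough illustration of what an emotion-annotated speech translation example might look like in code, the sketch below defines a record type and tallies utterances per emotion. The field names, the `Utterance` class, and the `emotion_counts` helper are hypothetical and are not taken from the MELD-ST release; the label set is the seven-category emotion inventory of the original MELD dataset, and the two sample rows are made up for demonstration.

```python
from collections import Counter
from dataclasses import dataclass

# The seven emotion categories annotated in the original MELD dataset.
MELD_EMOTIONS = ("anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise")

# Target languages of the two language pairs named in the abstract.
TARGET_LANGS = ("ja", "de")


@dataclass(frozen=True)
class Utterance:
    """Hypothetical record for one emotion-annotated translation example."""

    audio_path: str   # assumed: path to the English source audio
    source_text: str  # English transcript
    target_text: str  # Japanese or German translation
    target_lang: str  # "ja" or "de"
    emotion: str      # one of MELD_EMOTIONS

    def __post_init__(self):
        # Reject labels and languages outside the expected sets.
        if self.emotion not in MELD_EMOTIONS:
            raise ValueError(f"unknown emotion label: {self.emotion!r}")
        if self.target_lang not in TARGET_LANGS:
            raise ValueError(f"unknown target language: {self.target_lang!r}")


def emotion_counts(utterances):
    """Count how many utterances carry each emotion label."""
    return Counter(u.emotion for u in utterances)


# Made-up sample rows, for illustration only.
samples = [
    Utterance("clip_0001.wav", "Oh my God!", "なんてこと!", "ja", "surprise"),
    Utterance("clip_0002.wav", "I'm so happy for you.", "Ich freue mich so für dich.", "de", "joy"),
]
counts = emotion_counts(samples)
```

Checking the label distribution this way is a common first step before fine-tuning, since emotion-labeled conversational data is often heavily skewed toward the neutral class.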
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques
