On the Hallucination in Simultaneous Machine Translation
Meizhi Zhong, Kehai Chen, Zhengshan Xue, Lemao Liu, Mingming Yang, Min Zhang

TL;DR
This paper analyzes hallucination issues in Simultaneous Machine Translation, revealing how target-side information influences hallucination and suggesting that reducing reliance on target context can mitigate the problem.
Contribution
It provides a comprehensive analysis of hallucination in SiMT, focusing on distribution and target-side context, and proposes a method to alleviate hallucination by limiting target information usage.
Findings
Hallucination words are influenced by target-side context.
Reducing target-side information decreases hallucination.
Understanding how hallucination words are distributed can inform the design of better SiMT models.
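To make the notion of a hallucination distribution concrete, here is a minimal sketch of one way a hallucination rate could be computed from word alignments. It assumes, for illustration only, that a target word counts as hallucinated when no source word is aligned to it; this page does not state the paper's exact definition. The function name `hallucination_rate` and the toy alignment data are hypothetical.

```python
from typing import Iterable, Set, Tuple


def hallucination_rate(
    num_target_words: int,
    alignments: Iterable[Tuple[int, int]],
) -> float:
    """Return the fraction of target positions with no aligned source position.

    Assumption (not taken from the paper): a target word is treated as a
    hallucination if it does not appear in any (source_idx, target_idx) pair.
    """
    if num_target_words == 0:
        return 0.0
    # Collect every in-range target position that has at least one alignment link.
    aligned_targets: Set[int] = {
        t for _, t in alignments if 0 <= t < num_target_words
    }
    unaligned = num_target_words - len(aligned_targets)
    return unaligned / num_target_words


# Toy example: 5 target words; target positions 0, 1, and 3 are aligned.
toy_alignments = [(0, 0), (1, 1), (2, 3), (3, 3)]
rate = hallucination_rate(5, toy_alignments)
print(f"hallucination rate: {rate:.2f}")
```

In practice the alignments would come from an external word aligner, and the rate could be compared across latency settings to see how hallucination varies.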
Abstract
It is widely known that hallucination is a critical issue in Simultaneous Machine Translation (SiMT) due to the absence of source-side information. While many efforts have been made to enhance SiMT performance, few attempt to understand and analyze hallucination in SiMT. Therefore, we conduct a comprehensive analysis of hallucination in SiMT from two perspectives: the distribution of hallucination words and their use of target-side context. Extensive experiments yield several valuable findings and, in particular, show that it is possible to alleviate hallucination by reducing the overuse of target-side information in SiMT.
Taxonomy
Topics: Big Data and Digital Economy · Biomedical Text Mining and Ontologies · Algorithms and Data Compression
