CBSiMT: Mitigating Hallucination in Simultaneous Machine Translation with Weighted Prefix-to-Prefix Training

Mengge Liu; Wen Zhang; Xiang Li; Yanzhi Tian; Yuhang Guo; Jian Luan; Bin Wang; Shuoying Chen

arXiv:2311.03672 · cs.CL · November 8, 2023 · 1 citation

Open Access

TL;DR

This paper introduces CBSiMT, a confidence-based training framework for simultaneous machine translation that reduces hallucinations caused by misaligned prefix pairs, improving translation accuracy especially at low latency.

Contribution

The paper proposes a confidence-based weighting method for prefix-to-prefix training that mitigates the hallucinations SiMT models produce when word order differs between the source and target languages.

Findings

01. Improvements of up to 2 BLEU at low latency.

02. Consistent gains in translation quality across multiple language pairs.

03. Effective reduction of hallucination tokens in SiMT tasks.

Abstract

Simultaneous machine translation (SiMT) is a challenging task that requires starting translation before the full source sentence is available. The prefix-to-prefix framework, which learns to predict target tokens using only a partial source prefix, is often applied to SiMT. However, due to word order differences between languages, misaligned prefix pairs cause SiMT models to suffer from serious hallucination problems, i.e., target outputs that are unfaithful to source inputs. Hallucination not only produces target tokens that are unsupported by the source prefix, but also hinders the model from generating the correct translation as more source words are received. In this work, we propose a Confidence-Based Simultaneous Machine Translation (CBSiMT) framework, which uses model confidence to perceive hallucination tokens and mitigates their negative impact with weighted prefix-to-prefix training.…
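
The idea of weighting the prefix-to-prefix loss by model confidence can be illustrated with a small sketch. This is not the paper's formulation, which the truncated abstract does not spell out. It assumes, for illustration only, that a token's confidence is the probability the model assigns to the reference token and that this probability is used directly as the loss weight. The names `softmax` and `confidence_weighted_nll` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_weighted_nll(step_logits, targets):
    """Weighted negative log-likelihood over one target prefix.

    step_logits[t] holds the vocabulary logits at target step t, which in
    prefix-to-prefix training are computed from a partial source prefix.
    targets[t] is the index of the reference token at step t.

    ASSUMPTION (not from the paper): the weight of each token is the
    probability the model gives to the reference token, so low-confidence
    tokens (possible hallucinations) contribute less to the loss.
    """
    confidences = [softmax(logits)[y] for logits, y in zip(step_logits, targets)]
    # In a real training loop the weights would be treated as constants,
    # i.e. excluded from gradient computation.
    weights = list(confidences)
    losses = [-math.log(c) for c in confidences]
    weighted_sum = sum(w * l for w, l in zip(weights, losses))
    return weighted_sum / sum(weights), weights


# Toy example: the model is confident at step 0 and uncertain at step 1.
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
reference = [0, 1]
loss, weights = confidence_weighted_nll(logits, reference)
print(loss, weights)
```

In this toy case the uncertain second token receives a smaller weight than the confident first one, so the weighted loss is lower than the plain average of the two token losses. Whether weights should be normalized, clipped, or combined with sentence-level terms is a design choice this sketch leaves open.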

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification