Mitigating Hallucinated Translations in Large Language Models with Hallucination-focused Preference Optimization
Zilu Tang, Rajen Chatterjee, Sarthak Garg

TL;DR
This paper presents a training method for large language models in machine translation that significantly reduces hallucinations during translation, improving trustworthiness without adding extra post-processing steps.
Contribution
The authors introduce a novel data creation framework for hallucination-focused preferences and demonstrate its effectiveness in reducing hallucinations during LLM-based translation.
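As a rough illustration of what a hallucination-focused preference dataset could look like, the sketch below pairs a non-hallucinated candidate translation (preferred) with a hallucinated one (rejected) for the same source sentence. This is an assumption-laden sketch, not the authors' framework: the `PreferencePair` type, the `build_pairs` helper, and the toy length-ratio detector are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class PreferencePair:
    """One training example for preference optimization (hypothetical schema)."""
    source: str    # sentence to translate
    chosen: str    # candidate judged free of hallucination
    rejected: str  # candidate judged to be a hallucination


def build_pairs(
    samples: Iterable[Tuple[str, List[str]]],
    is_hallucination: Callable[[str, str], bool],
) -> List[PreferencePair]:
    """Pair one clean and one hallucinated candidate per source, when both exist."""
    pairs = []
    for source, candidates in samples:
        clean = [c for c in candidates if not is_hallucination(source, c)]
        bad = [c for c in candidates if is_hallucination(source, c)]
        if clean and bad:
            pairs.append(PreferencePair(source, clean[0], bad[0]))
    return pairs


def toy_detector(source: str, candidate: str) -> bool:
    """Placeholder detector (assumption): flag outputs over 3x the source length."""
    return len(candidate.split()) > 3 * len(source.split())


samples = [
    ("Guten Morgen", ["Good morning", "Good morning " + "the " * 10]),
    ("Danke", ["Thanks"]),  # no hallucinated candidate, so no pair is produced
]
pairs = build_pairs(samples, toy_detector)
```

A real pipeline would replace `toy_detector` with a proper hallucination detector; sources with no contrast between candidates are skipped because they carry no preference signal.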
Findings
Hallucination rate reduced by an average of 96% across five language pairs.
In a zero-shot setting, hallucinations reduced by an average of 89% across three unseen target languages.
Translation quality preserved despite hallucination mitigation.
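The percentages above are relative reductions in hallucination rate. A minimal sketch of that arithmetic follows; the counts are hypothetical and chosen only to make the calculation concrete, not taken from the paper.

```python
def relative_reduction(baseline_rate: float, new_rate: float) -> float:
    """Return the fractional drop of new_rate relative to baseline_rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - new_rate) / baseline_rate


# Hypothetical counts (assumption): hallucinated outputs per 10,000 translations.
baseline = 50 / 10_000   # before mitigation
mitigated = 2 / 10_000   # after mitigation
reduction = relative_reduction(baseline, mitigated)
print(f"{reduction:.0%}")
```

Because the measure is relative, a large percentage reduction can correspond to a small absolute change when the baseline hallucination rate is already low.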
Abstract
Machine Translation (MT) is undergoing a paradigm shift, with systems based on fine-tuned large language models (LLMs) becoming increasingly competitive with traditional encoder-decoder models trained specifically for translation tasks. However, LLM-based systems are at a higher risk of generating hallucinations, which can severely undermine users' trust and safety. Most prior research on hallucination mitigation focuses on traditional MT models, with solutions that involve post-hoc mitigation: detecting hallucinated translations and re-translating them. While effective, this approach introduces additional complexity in deploying extra tools in production and also increases latency. To address these limitations, we propose a method that intrinsically learns to mitigate hallucinations during the model training phase. Specifically, we introduce a data creation framework to generate…
Taxonomy
Topics: Topological and Geometric Data Analysis
