Improving Translation Faithfulness of Large Language Models via Augmenting Instructions
Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou

TL;DR
This paper introduces SWIE and OVERMISS to enhance translation faithfulness in large language models by improving instruction understanding and reducing instruction forgetting, leading to significant performance gains in translation tasks.
Contribution
The paper proposes novel methods SWIE and OVERMISS that improve instruction understanding and translation faithfulness in LLMs, addressing instruction forgetting and over/under-translation issues.
Findings
SWIE significantly improves translation performance, especially in zero-shot and long text scenarios.
OVERMISS enhances translation accuracy, increasing BLEU scores and faithfulness metrics.
Combining SWIE and OVERMISS yields further performance improvements across different models.
Abstract
Large Language Models (LLMs) present strong general capabilities, and a current compelling challenge is stimulating their specialized capabilities, such as machine translation, through low-cost instruction tuning. The standard instruction-following data is sequentially organized as the concatenation of an instruction, an input, and a response. As the attention mechanism of LLMs has limitations on local focus, LLMs tend to focus more on the words or sentences nearby at each position. This leads to a high risk of instruction forgetting during decoding. To alleviate the above issues, we propose SWIE (Segment-Weighted Instruction Embedding) and an instruction-following dataset OVERMISS. SWIE improves the model's instruction understanding by adding a global instruction representation on the following input and response representations. OVERMISS improves model faithfulness by comparing…
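The abstract's description of SWIE (a global instruction representation added onto the input and response representations that follow it) can be illustrated with a toy sketch. This is not the authors' implementation: the mean pooling, the fixed `segment_len`, the per-segment `weights`, and the function names below are all assumptions made for illustration, and plain Python lists stand in for model hidden states.

```python
def mean_pool(vectors):
    """Average a list of equal-length vectors (assumed pooling choice)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def add_instruction_embedding(hidden, instr_len, segment_len, weights):
    """Add a pooled instruction vector to every position after the instruction.

    hidden      : list of per-token vectors; the first `instr_len` tokens
                  are the instruction, the rest are input and response.
    segment_len : hypothetical number of tokens per segment.
    weights     : hypothetical per-segment scaling factors; the last weight
                  is reused for any segment beyond the end of the list.
    """
    instr_repr = mean_pool(hidden[:instr_len])

    # Instruction tokens are copied through unchanged.
    out = [list(v) for v in hidden[:instr_len]]

    for pos in range(instr_len, len(hidden)):
        segment = (pos - instr_len) // segment_len
        w = weights[min(segment, len(weights) - 1)]
        out.append([h + w * g for h, g in zip(hidden[pos], instr_repr)])
    return out


# Toy example: 2 instruction tokens followed by 3 input/response tokens.
hidden = [[1.0, 0.0], [3.0, 2.0], [0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
mixed = add_instruction_embedding(hidden, instr_len=2, segment_len=2,
                                  weights=[0.5, 1.0])
```

In this sketch, positions far from the instruction can be given a larger weight, so the instruction signal is re-injected where local attention would otherwise be most likely to lose it. How the weights are actually obtained is not stated in the text above.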
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: BLOOM · Focus
