Improving Translation Faithfulness of Large Language Models via Augmenting Instructions

Yijie Chen; Yijin Liu; Fandong Meng; Yufeng Chen; Jinan Xu; Jie Zhou

arXiv:2308.12674·cs.CL·August 25, 2023·6 cites

TL;DR

This paper introduces SWIE and OVERMISS to enhance translation faithfulness in large language models by improving instruction understanding and reducing instruction forgetting, leading to significant performance gains in translation tasks.

Contribution

The paper proposes novel methods SWIE and OVERMISS that improve instruction understanding and translation faithfulness in LLMs, addressing instruction forgetting and over/under-translation issues.

Findings

01

SWIE significantly improves translation performance, especially in zero-shot and long text scenarios.

02

OVERMISS enhances translation accuracy, increasing BLEU scores and faithfulness metrics.

03

Combining SWIE and OVERMISS yields further performance improvements across different models.

Abstract

Large Language Models (LLMs) exhibit strong general capabilities, and a compelling current challenge is stimulating their specialized capabilities, such as machine translation, through low-cost instruction tuning. Standard instruction-following data is organized sequentially as the concatenation of an instruction, an input, and a response. Because the attention mechanism of LLMs is limited by its local focus, LLMs tend to attend more to nearby words or sentences at each position. This leads to a high risk of instruction forgetting during decoding. To alleviate this issue, we propose SWIE (Segment-Weighted Instruction Embedding) and an instruction-following dataset, OVERMISS. SWIE improves the model's instruction understanding by adding a global instruction representation to the subsequent input and response representations. OVERMISS improves model faithfulness by comparing…
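The abstract's one-sentence description of SWIE (a global instruction representation added to the input and response representations that follow it) can be illustrated with a toy sketch. This is not the authors' implementation: the mean pooling, the per-segment weights, the segment labels, and the names `mean_pool` and `add_instruction_representation` are all assumptions made for illustration, and plain Python lists stand in for model hidden states.

```python
from typing import Dict, List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average equal-length vectors (mean pooling is an assumed choice)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def add_instruction_representation(
    hidden: List[Vector],
    segments: List[str],
    weights: Dict[str, float],
) -> List[Vector]:
    """Add a pooled instruction vector to every non-instruction position.

    hidden   : one vector per token position (toy stand-in for hidden states)
    segments : segment label per position: "instruction", "input", "response"
    weights  : hypothetical per-segment scaling of the instruction vector
    """
    instruction = [h for h, s in zip(hidden, segments) if s == "instruction"]
    if not instruction:
        # Nothing to pool: return an unchanged copy.
        return [list(h) for h in hidden]
    global_repr = mean_pool(instruction)
    fused = []
    for h, s in zip(hidden, segments):
        w = weights.get(s, 0.0)  # instruction positions get weight 0.0
        fused.append([x + w * g for x, g in zip(h, global_repr)])
    return fused


# Toy sequence: two instruction tokens, one input token, one response token.
hidden = [[1.0, 0.0], [3.0, 2.0], [0.0, 0.0], [1.0, 1.0]]
segments = ["instruction", "instruction", "input", "response"]
fused = add_instruction_representation(
    hidden, segments, {"input": 0.5, "response": 1.0}
)
```

In this sketch the instruction positions are left unchanged, while each input and response position receives a scaled copy of the pooled instruction vector, so the instruction signal stays present at positions far from the instruction itself.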

Code & Models

Repositories

pppa2019/swie_overmiss_llm4mt
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: BLOOM · Focus