Mitigating the Language Mismatch and Repetition Issues in LLM-based Machine Translation via Model Editing

Weichuan Wang; Zhaoyi Li; Defu Lian; Chen Ma; Linqi Song; Ying Wei

arXiv:2410.07054·cs.CL·October 10, 2024

TL;DR

This paper uses model editing to mitigate two common errors in LLM-based machine translation, language mismatch and repetition, and proposes a refined method that reduces these errors without harming overall translation quality.

Contribution

It introduces a novel model editing approach that refines error-related components in LLMs, effectively reducing specific translation errors while preserving general performance.

Findings

01. Significant reduction in language mismatch errors

02. Effective decrease in repetition issues

03. Maintained or improved overall translation quality

Abstract

Large Language Models (LLMs) have recently revolutionized the NLP field, yet they still fall short in some specific downstream tasks. In this work, we focus on using LLMs to perform machine translation, where we observe that two patterns of errors frequently occur and drastically affect translation quality: language mismatch and repetition. The work sets out to explore the potential for mitigating these two issues by leveraging model editing methods, e.g., by locating Feed-Forward Network (FFN) neurons or other components that are responsible for the errors and deactivating them at inference time. We find that directly applying such methods either has limited effect on the targeted errors or has significant negative side effects on general translation quality, indicating that the located components may also be crucial for keeping machine translation with LLMs on track. To…
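The "locate and deactivate" idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a single two-layer FFN with ReLU, stored as plain Python lists, and the names `ffn_forward` and `deactivated` are hypothetical. It shows only the deactivation step (zeroing the activations of chosen hidden neurons at inference time); how the error-related neurons are located is outside the sketch.

```python
from typing import AbstractSet, List, Sequence


def relu(value: float) -> float:
    """Standard ReLU non-linearity."""
    return value if value > 0.0 else 0.0


def ffn_forward(
    x: Sequence[float],
    w_in: Sequence[Sequence[float]],   # one row per hidden neuron
    w_out: Sequence[Sequence[float]],  # one row per output dimension
    deactivated: AbstractSet[int] = frozenset(),  # hypothetical: indices of neurons to switch off
) -> List[float]:
    """Toy FFN forward pass that zeroes selected hidden neurons."""
    # Hidden activations: one value per FFN neuron.
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    # "Deactivate" a neuron by forcing its activation to zero,
    # so it contributes nothing to the output projection.
    hidden = [0.0 if i in deactivated else h for i, h in enumerate(hidden)]
    # Output projection back to the model dimension.
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


if __name__ == "__main__":
    x = [1.0, 2.0]
    w_in = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    w_out = [[1.0, 1.0, 1.0], [1.0, 0.0, -1.0]]
    print(ffn_forward(x, w_in, w_out))                   # all neurons active
    print(ffn_forward(x, w_in, w_out, deactivated={2}))  # neuron 2 switched off
```

Comparing the two outputs shows why such edits can have side effects: switching off one neuron changes every output dimension it projects to, not only the behavior being targeted.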

Code & Models

Repositories

weichuanw/llm-based-mt-via-model-editing
PyTorch · Official

Videos

Mitigating the Language Mismatch and Repetition Issues in LLM-based Machine Translation via Model Editing · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Mathematics, Computing, and Information Processing

Methods: Focus