Doc-Guided Sent2Sent++: A Sent2Sent++ Agent with Doc-Guided memory for Document-level Machine Translation

Jiaxin Guo; Yuanchang Luo; Daimeng Wei; Ling Zhang; Zongyao Li; Hengchao Shang; Zhiqiang Rao; Shaojun Li; Jinlong Yang; Zhanglin Wu; Hao Yang

arXiv:2501.08523·cs.CL·January 16, 2025

TL;DR

This paper presents Doc-Guided Sent2Sent++, an innovative agent for document-level machine translation that uses a doc-guided memory and incremental decoding to improve translation quality, consistency, and fluency across multiple languages.

Contribution

It introduces the Sent2Sent++ decoding method and a Doc-Guided Memory mechanism, demonstrating significant improvements over existing approaches in document-level translation.

Findings

01. Outperforms existing methods in quality, consistency, and fluency metrics.

02. Achieves significant improvements in s-COMET, d-COMET, LTCR-1f, and document perplexity.

03. Effective across multiple languages and domains.

Abstract

The field of artificial intelligence has witnessed significant advancements in natural language processing, largely attributed to the capabilities of Large Language Models (LLMs). These models form the backbone of Agents designed to address long-context dependencies, particularly in Document-level Machine Translation (DocMT). DocMT presents unique challenges, with quality, consistency, and fluency being the key metrics for evaluation. Existing approaches, such as Doc2Doc and Doc2Sent, either omit sentences or compromise fluency. This paper introduces Doc-Guided Sent2Sent++, an Agent that employs an incremental sentence-level forced decoding strategy to ensure every sentence is translated while enhancing the fluency of adjacent sentences. Our Agent leverages a Doc-Guided Memory, focusing solely on the summary and its translation, which we find to be an efficient approach to…
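The abstract gives only a high-level description of the decoding strategy, so the following Python sketch is one possible reading of it, not the authors' implementation. It assumes that each step feeds the model the previous and current source sentences, forces the already-produced translation of the previous sentence as a prefix, and keeps only the continuation as the translation of the current sentence. The names `DocGuidedMemory`, `sent2sent_pp`, and `toy_translate` are hypothetical, and `toy_translate` is a stand-in for an LLM call (it merely upper-cases its input).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DocGuidedMemory:
    """Hypothetical container: the abstract says the memory holds only
    the document summary and that summary's translation."""
    summary: str
    summary_translation: str


# Hypothetical model interface (an assumption, not from the paper):
# (memory, source_window, forced_prefix) -> text generated after the prefix.
TranslateFn = Callable[[DocGuidedMemory, str, str], str]


def sent2sent_pp(sentences: List[str],
                 memory: DocGuidedMemory,
                 translate_continuation: TranslateFn) -> List[str]:
    """Translate sentence by sentence, forcing the previous translation
    as a prefix so that exactly one output is produced per input sentence."""
    translations: List[str] = []
    for i, src in enumerate(sentences):
        prev_src = sentences[i - 1] if i > 0 else ""
        prev_tgt = translations[i - 1] if i > 0 else ""
        # Source window: previous sentence plus the current one.
        window = (prev_src + " " + src).strip()
        # The model is assumed to continue after the forced prefix.
        continuation = translate_continuation(memory, window, prev_tgt)
        translations.append(continuation.strip())
    return translations


def toy_translate(memory: DocGuidedMemory, window: str, forced_prefix: str) -> str:
    """Stand-in for an LLM: 'translates' by upper-casing the window,
    then returns only the part that follows the forced prefix."""
    full = window.upper()
    assert full.startswith(forced_prefix)
    return full[len(forced_prefix):]


memory = DocGuidedMemory(summary="toy summary", summary_translation="TOY SUMMARY")
result = sent2sent_pp(["a b.", "c d.", "e f."], memory, toy_translate)
print(result)
```

Because the loop emits one entry per source sentence, no sentence can be omitted, which is the property the abstract contrasts with Doc2Doc. In a real system the toy function would be replaced by a model call that receives the memory in its prompt; how the memory is actually built and used is not specified in the text above.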

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling