NeoAMT: Neologism-Aware Agentic Machine Translation with Reinforcement Learning

Zhongtao Miao; Kaiyan Zhao; Masaaki Nagata; Yoshimasa Tsuruoka

arXiv:2601.03790 · cs.CL · April 28, 2026

TL;DR

This paper introduces NeoAMT, an agentic framework trained with reinforcement learning for translating sentences that contain neologisms. It is supported by a Wiktionary-based search toolkit and a new multilingual dataset.

Contribution

The paper presents an agentic framework for neologism-aware machine translation, together with a dedicated dataset, a search toolkit, and an RL training strategy with a new reward design.

Findings

01. Constructed a multilingual dataset with 16 languages and 75 translation directions.

02. Developed a Wiktionary-based search toolkit for neologism translation.

03. Proposed an RL training framework with a novel reward and adaptive rollout strategy.
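
To make the idea of a dictionary-backed search tool concrete, the following is a minimal sketch of an exact-match lookup over Wiktionary-style records. It is not the paper's toolkit: the `Entry` fields, the `DictionarySearch` class, the `lookup` method, and the sample records are all hypothetical and were chosen only for illustration. The actual retrieval method used by NeoAMT is not described in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One dictionary record (hypothetical schema, not the paper's)."""
    term: str
    lang: str
    gloss: str


class DictionarySearch:
    """Exact-match, case-insensitive lookup keyed by (language, term)."""

    def __init__(self, entries):
        self._index = {}
        for entry in entries:
            key = (entry.lang, entry.term.casefold())
            self._index.setdefault(key, []).append(entry)

    def lookup(self, term, lang):
        """Return all records for `term` in `lang`, or an empty list."""
        return list(self._index.get((lang, term.casefold()), []))


# Illustrative records written for this sketch, not taken from the dataset.
entries = [
    Entry("doomscrolling", "en", "continuously reading bad news online"),
    Entry("rizz", "en", "charm or skill in attracting a partner"),
]

search = DictionarySearch(entries)
hits = search.lookup("Rizz", "en")    # case-insensitive match on "rizz"
misses = search.lookup("rizz", "ja")  # no record for this language
```

In an agentic setup, a translation model could call a tool like `lookup` when it meets an unfamiliar word and then condition its translation on the returned glosses. A real toolkit would likely need fuzzy or semantic matching rather than exact keys.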

Abstract

Neologism-aware machine translation aims to translate source sentences containing neologisms into target languages. This field remains underexplored compared with general machine translation (MT). In this paper, we propose an agentic framework, NeoAMT, for neologism-aware machine translation equipped with a Wiktionary-based search toolkit. Specifically, we first construct a dedicated dataset for neologism-aware machine translation and build a search toolkit grounded in Wiktionary. The dataset covers 16 languages and 75 translation directions in total, derived from approximately 10 million records of an English Wiktionary dump. The retrieval corpus of the search toolkit is also constructed from around 3 million cleaned records of the same dump. We then leverage the dataset and toolkit to train a translation agent via reinforcement learning (RL) and to evaluate the accuracy of…
