Simplifying Translations for Children: Iterative Simplification Considering Age of Acquisition with LLMs
Masashi Oshika, Makoto Morishita, Tsutomu Hirao, Ryohei Sasano, Koichi Takeda

TL;DR
This paper introduces an iterative word simplification method for translations aimed at children, leveraging large language models to replace complex words with simpler ones based on Age of Acquisition, improving accessibility while maintaining translation quality.
Contribution
The study presents a novel LLM-based approach for iteratively simplifying translations by replacing high-AoA words, tailored for children's comprehension, and creates a new benchmark dataset for evaluation.
Findings
Effective replacement of high-AoA words with lower-AoA equivalents.
Maintains high BLEU and COMET scores after iterative simplification.
Demonstrates the method's ability to produce simpler yet accurate translations.
Abstract
In recent years, neural machine translation (NMT) has been widely used in everyday life. However, current NMT lacks a mechanism to adjust the difficulty level of translations to match the user's language level. Additionally, due to bias in the NMT training data, simple source sentences are often translated with complex words. This can pose a particular problem for children, who may not be able to understand the meaning of the translations correctly. In this study, we propose a method that replaces words with a high Age of Acquisition (AoA) in translations with simpler words to match the translations to the user's level. We achieve this by using large language models (LLMs), providing a triple of a source sentence, a translation, and a target word to be replaced. We create a benchmark dataset using back-translation on Simple English Wikipedia. The…
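The iterative replacement loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the AoA values in `TOY_AOA`, the `threshold` of 8.0 years, the highest-AoA-first selection order, and the `toy_replace` function (a stand-in for the LLM call that receives the source, translation, and target word) are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Set

# Hypothetical AoA values in years; a real system would load published AoA norms.
TOY_AOA: Dict[str, float] = {
    "purchase": 9.5, "buy": 4.2,
    "automobile": 10.1, "car": 3.4,
}

# Signature of the replacement step: (source, translation, target_word) -> new translation.
ReplaceFn = Callable[[str, str, str], str]


def highest_aoa_word(sentence: str, aoa: Dict[str, float],
                     threshold: float, skip: Set[str]) -> Optional[str]:
    """Return the word with the highest AoA above the threshold, or None."""
    tokens = [w.strip(".,!?;:").lower() for w in sentence.split()]
    candidates = [(aoa[w], w) for w in tokens
                  if w in aoa and aoa[w] > threshold and w not in skip]
    return max(candidates)[1] if candidates else None


def simplify(source: str, translation: str, replace_fn: ReplaceFn,
             aoa: Dict[str, float] = TOY_AOA,
             threshold: float = 8.0, max_iters: int = 10) -> str:
    """Repeatedly replace one high-AoA word at a time until none remain."""
    skip: Set[str] = set()  # words the replacement step could not simplify
    for _ in range(max_iters):
        target = highest_aoa_word(translation, aoa, threshold, skip)
        if target is None:
            break
        updated = replace_fn(source, translation, target)
        if updated == translation:
            skip.add(target)  # avoid retrying the same word forever
        translation = updated
    return translation


def toy_replace(source: str, translation: str, target: str) -> str:
    """Stand-in for the LLM call; uses a fixed lookup table instead of a model."""
    table = {"purchase": "buy", "automobile": "car"}
    return translation.replace(target, table.get(target, target))


if __name__ == "__main__":
    print(simplify("<source sentence>", "She will purchase the automobile.", toy_replace))
```

In practice `toy_replace` would be swapped for a function that prompts an LLM with the triple, which is also where fluency repairs (articles, agreement) around the replaced word would need to happen.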
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques
