Creative and Context-Aware Translation of East Asian Idioms with GPT-4
Kenan Tang, Peiyang Song, Yao Qin, Xifeng Yan

TL;DR
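This paper evaluates whether GPT-4 can generate high-quality, context-aware translations of East Asian idioms. With suitable prompting strategies, GPT-4 outperforms the Google and DeepL translation engines on automatic evaluations of faithfulness and creativity, which could reduce the effort required of human translators.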
Contribution
The paper identifies Pareto-optimal prompting strategies for GPT-4 and shows that, at low cost, context-aware GPT-4 translations yield far more high-quality translations per idiom than the human baseline.
Findings
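With Pareto-optimal prompting strategies, GPT-4 outperforms the Google and DeepL translation engines on automatic evaluations of faithfulness and creativity.
Context-aware translations achieve far more high-quality translations per idiom than the human baseline, at low cost.
All code and data are open-sourced to facilitate further research.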
Abstract
As a type of figurative language, an East Asian idiom condenses rich cultural background into only a few characters. Translating such idioms is challenging for human translators, who often resort to choosing a context-aware translation from an existing list of candidates. However, compiling a dictionary of candidate translations demands much time and creativity even for expert translators. To alleviate such burden, we evaluate if GPT-4 can help generate high-quality translations. Based on automatic evaluations of faithfulness and creativity, we first identify Pareto-optimal prompting strategies that can outperform translation engines from Google and DeepL. Then, at a low cost, our context-aware translations can achieve far more high-quality translations per idiom than the human baseline. We open-source all code and data to facilitate further research.
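The Pareto-optimal selection mentioned in the abstract can be illustrated with a minimal sketch. It assumes each prompting strategy has been reduced to a single (faithfulness, creativity) score pair where higher is better. The strategy names and numbers below are made-up placeholders, not results from the paper, and the function names are hypothetical rather than taken from the authors' code.

```python
from typing import Dict, List, Tuple

# (faithfulness, creativity); higher is assumed to be better on both axes.
Scores = Tuple[float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if a is at least as good as b on both metrics
    and strictly better on at least one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_optimal(scores: Dict[str, Scores]) -> List[str]:
    """Return the names of strategies that no other strategy dominates."""
    return sorted(
        name
        for name, s in scores.items()
        if not any(
            dominates(other, s)
            for other_name, other in scores.items()
            if other_name != name
        )
    )


# Placeholder scores for illustration only (NOT from the paper).
example_scores = {
    "strategy_a": (0.90, 0.40),
    "strategy_b": (0.70, 0.80),
    "strategy_c": (0.60, 0.60),  # dominated by strategy_b
    "strategy_d": (0.90, 0.30),  # dominated by strategy_a
}

front = pareto_optimal(example_scores)
print(front)
```

A strategy stays on the front when improving one metric would require giving up some of the other, which is why two strategies with different trade-offs can both be Pareto-optimal.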
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Biomedical Text Mining and Ontologies
Methods: Linear Layer · Multi-Head Attention · Layer Normalization · Dense Connections · Attention Is All You Need · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding
