Back to School: Translation Using Grammar Books
Jonathan Hus, Antonios Anastasopoulos

TL;DR
This paper places grammar books in GPT-4 prompts as linguistic reference material to improve machine translation for 16 low-resource languages, showing one way to put existing documentation to work for under-represented languages.
Contribution
The paper incorporates grammar books into LLM prompts to improve translation for under-represented languages, extending the usable resources beyond parallel corpora to existing reference material.
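The idea of combining reference material in a single prompt can be sketched as follows. This is a minimal illustration, not the authors' prompt template, which this summary does not reproduce. The function name `build_translation_prompt`, the section labels, the character cap `max_grammar_chars`, and the placeholder language "Xish" are all assumptions made for the example.

```python
from typing import Mapping, Optional, Sequence, Tuple


def build_translation_prompt(
    sentence: str,
    source_lang: str,
    target_lang: str,
    grammar_book: str = "",
    dictionary: Optional[Mapping[str, str]] = None,
    parallel_examples: Sequence[Tuple[str, str]] = (),
    max_grammar_chars: int = 200_000,  # hypothetical cap to stay within the context window
) -> str:
    """Assemble one prompt string from the available reference material."""
    parts = [
        f"Translate the following sentence from {source_lang} to {target_lang}."
    ]

    if grammar_book:
        # Naive truncation by characters; a real system would count tokens.
        excerpt = grammar_book[:max_grammar_chars]
        parts.append(f"Grammar book for {source_lang}:\n{excerpt}")

    if dictionary:
        # Look up each distinct word of the sentence, ignoring case and
        # surrounding punctuation, and keep only the words that have an entry.
        words = dict.fromkeys(w.strip(".,;:!?").lower() for w in sentence.split())
        entries = [f"{w}: {dictionary[w]}" for w in words if w in dictionary]
        if entries:
            parts.append("Dictionary entries:\n" + "\n".join(entries))

    if parallel_examples:
        pairs = [
            f"{source_lang}: {src}\n{target_lang}: {tgt}"
            for src, tgt in parallel_examples
        ]
        parts.append("Example translations:\n" + "\n".join(pairs))

    parts.append(f"Sentence: {sentence}\nTranslation:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Toy data in an invented language, for illustration only.
    print(
        build_translation_prompt(
            "kato mira",
            "Xish",
            "English",
            grammar_book="Verbs follow the object.",
            dictionary={"kato": "dog"},
            parallel_examples=[("mira", "runs")],
        )
    )
```

The resulting string would then be sent to the model as the prompt. Each kind of reference material is optional, so the same function can build prompts with different combinations of resources.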
Findings
Improved translation quality for 16 low-resource languages.
Demonstrates the effectiveness of grammar books in LLM prompts.
Shows potential for broader application in low-resource language translation.
Abstract
Machine translation systems for high-resource languages perform exceptionally well and produce high-quality translations. Unfortunately, the vast majority of languages are not considered high-resource and lack the quantity of parallel sentences needed to train such systems. These under-represented languages are not without resources, however, and bilingual dictionaries and grammar books are available as linguistic reference material. With current large language models (LLMs) supporting near book-length contexts, we can begin to use the available material to ensure advancements are shared among all of the world's languages. In this paper, we demonstrate incorporating grammar books in the prompt of GPT-4 to improve machine translation and evaluate the performance on 16 typologically diverse low-resource languages, using a combination of reference material to show that the machine…
Taxonomy
Topics: Translation Studies and Practices
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Residual Connection · Position-Wise Feed-Forward Layer · Dense Connections · Softmax · Multi-Head Attention · Adam · Dropout
