How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation
Amr Hendy, Mohamed Abdelrehim, Amr Sharaf, Vikas Raunak, Mohamed Gabr, Hitokazu Matsushita, Young Jin Kim, Mohamed Afify, Hany Hassan Awadalla

TL;DR
This paper thoroughly evaluates GPT models' performance in machine translation across various languages and settings, revealing strengths in high-resource languages and limitations in low-resource ones, with hybrid approaches offering improvements.
Contribution
It provides the first comprehensive comparison of GPT models with state-of-the-art translation systems across diverse languages and prompts, highlighting their capabilities and constraints.
Findings
GPT models perform well on high-resource languages.
Translation quality is limited for low-resource languages.
Hybrid methods improve overall translation performance.
Abstract
Generative Pre-trained Transformer (GPT) models have shown remarkable capabilities for natural language generation, but their performance for machine translation has not been thoroughly investigated. In this paper, we present a comprehensive evaluation of GPT models for machine translation, covering various aspects such as quality of different GPT models in comparison with state-of-the-art research and commercial systems, effect of prompting strategies, robustness towards domain shifts and document-level translation. We experiment with eighteen different translation directions involving high and low resource languages, as well as non English-centric translations, and evaluate the performance of three GPT models: ChatGPT, GPT3.5 (text-davinci-003), and text-davinci-002. Our results show that GPT models achieve very competitive translation quality for high resource languages, while having…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Label Smoothing · Absolute Position Encodings · Adam · Multi-Head Attention · Position-Wise Feed-Forward Layer · Linear Warmup With Cosine Annealing
