Evaluating o1-Like LLMs: Unlocking Reasoning for Translation through Comprehensive Analysis

Andong Chen; Yuchen Song; Wenxin Zhu; Kehai Chen; Muyun Yang; Tiejun Zhao; Min Zhang

arXiv:2502.11544·cs.CL·February 18, 2025·2 cites

Open Access

TL;DR

This paper evaluates o1-Like LLMs in multilingual machine translation, revealing their strengths, limitations, and factors affecting translation quality, with implications for resource use and parameter tuning.

Contribution

It provides a comprehensive analysis of o1-Like LLMs' translation performance, comparing them with traditional models and identifying key factors influencing quality.

Findings

01. o1-Like LLMs set new translation benchmarks

02. DeepSeek-R1 outperforms GPT-4o in contextless tasks

03. Translation quality improves with model size and lower temperature
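
For readers unfamiliar with the temperature parameter mentioned in the third finding, the following is a minimal, generic sketch of how temperature reshapes a next-token distribution. It is not the paper's evaluation code: the function name `softmax_with_temperature` and the example logits are illustrative assumptions, and the paper's actual decoding setup is not described on this page.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by a temperature.

    Hypothetical helper for illustration only; names and values are assumptions.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens.
example_logits = [2.0, 1.0, 0.5]

low_temp = softmax_with_temperature(example_logits, 0.2)
high_temp = softmax_with_temperature(example_logits, 1.5)

# A lower temperature concentrates probability on the top-scoring token,
# which makes sampled output more deterministic.
print("T=0.2:", [round(p, 3) for p in low_temp])
print("T=1.5:", [round(p, 3) for p in high_temp])
```

In this sketch, the lower temperature puts more probability mass on the highest-scoring candidate, which is the general mechanism by which lower temperatures reduce sampling variability.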

Abstract

o1-Like LLMs are transforming AI by simulating human cognitive processes, but their performance in multilingual machine translation (MMT) remains underexplored. This study examines (1) how o1-Like LLMs perform in MMT tasks and (2) what factors influence their translation quality. We evaluate multiple o1-Like LLMs and compare them with traditional models such as ChatGPT and GPT-4o. Results show that o1-Like LLMs establish new multilingual translation benchmarks, with DeepSeek-R1 surpassing GPT-4o in contextless tasks. They show strengths in historical and cultural translation but tend to ramble in Chinese-centric outputs. Further analysis reveals three key insights: (1) High inference costs and slower processing speeds make complex translation tasks more resource-intensive. (2) Translation quality improves with model size, enhancing commonsense reasoning…

Taxonomy

Topics: Natural Language Processing Techniques