Chinese Language Is Not More Efficient Than English in Vibe Coding: A Preliminary Study on Token Cost and Problem-Solving Rate
Simiao Ren, Xingyu Shen, Yuchen Zhou, Dennis (Tsang) Ng, Ankit Raj

TL;DR
This study empirically evaluates the claim that Chinese prompts are more token-efficient than English in LLM coding tasks, finding no consistent efficiency advantage and lower success rates in Chinese.
Contribution
It provides an empirical analysis on SWE-bench Lite showing that Chinese prompts offer no consistent token cost benefit and yield lower success rates, challenging a widely circulated assumption.
Findings
No token efficiency advantage is observed for Chinese prompts.
Token costs vary unpredictably across models.
Chinese prompts generally yield lower success rates.
Abstract
A claim has been circulating on social media and practitioner forums that Chinese prompts are more token-efficient than English for LLM coding tasks, potentially reducing costs by up to 40%. This claim has influenced developers to consider switching to Chinese for "vibe coding" to save on API costs. In this paper, we conduct a rigorous empirical study using SWE-bench Lite, a benchmark of software engineering tasks, to evaluate whether this claim of Chinese token efficiency holds up to scrutiny. Our results reveal three key findings: First, the efficiency advantage of Chinese is not observed. Second, token cost varies by model architecture in ways that defy simple assumptions: while MiniMax-2.7 shows 1.28x higher token costs for Chinese, GLM-5 actually consumes fewer tokens with Chinese prompts. Third, and most importantly, we found that the success rate when prompting in Chinese is…
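The two quantities the abstract compares, a token cost ratio (such as the 1.28x figure) and a success rate, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names `token_cost_ratio` and `success_rate` are hypothetical, the ratio is assumed to be total Chinese-prompt tokens divided by total English-prompt tokens over the same tasks, and the numbers are made-up toy values rather than results from the paper.

```python
def token_cost_ratio(zh_tokens, en_tokens):
    """Total tokens with Chinese prompts divided by total tokens with English prompts.

    Assumes both lists cover the same tasks in the same order (hypothetical setup).
    A value above 1.0 means the Chinese runs consumed more tokens.
    """
    if len(zh_tokens) != len(en_tokens):
        raise ValueError("token lists must cover the same tasks")
    if not en_tokens:
        raise ValueError("no tasks to compare")
    return sum(zh_tokens) / sum(en_tokens)


def success_rate(outcomes):
    """Fraction of tasks resolved, given a list of booleans (True = resolved)."""
    if not outcomes:
        raise ValueError("no outcomes to summarize")
    return sum(outcomes) / len(outcomes)


# Toy values for illustration only; NOT data from the paper.
zh_tokens = [1200, 900, 1500]
en_tokens = [1000, 1000, 1000]
zh_resolved = [True, False, False, True]
en_resolved = [True, True, False, True]

ratio = token_cost_ratio(zh_tokens, en_tokens)
zh_rate = success_rate(zh_resolved)
en_rate = success_rate(en_resolved)

print(f"token cost ratio (zh/en): {ratio:.2f}")
print(f"success rate zh: {zh_rate:.2f}, en: {en_rate:.2f}")
```

Reporting the two numbers side by side matters because a lower token count is not a saving if fewer tasks are solved; how the paper itself aggregates tokens across tasks is not stated in this excerpt.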
