CangjieBench: Benchmarking LLMs on a Low-Resource General-Purpose Programming Language
Junhang Cheng, Fang Liu, Jia Li, Chengru Wu, Nanxiang Jiang, Li Zhang

TL;DR
CangjieBench is a contamination-free benchmark for evaluating large language models on Cangjie, a low-resource general-purpose programming language. It compares four generation settings and contrasts Text-to-Code with Code-to-Code performance.
Contribution
The paper presents CangjieBench, a new benchmark for Cangjie, a low-resource general-purpose language, together with a systematic evaluation of LLMs across four generation settings.
Findings
Syntax-Constrained Generation balances accuracy and cost
The Agent setting achieves the highest accuracy, but at high token cost
Code-to-Code translation often underperforms Text-to-Code
Abstract
Large Language Models excel in high-resource programming languages but struggle with low-resource ones. Existing research related to low-resource programming languages primarily focuses on Domain-Specific Languages (DSLs), leaving general-purpose languages that suffer from data scarcity underexplored. To address this gap, we introduce CangjieBench, a contamination-free benchmark for Cangjie, a representative low-resource general-purpose language. The benchmark comprises 248 high-quality samples manually translated from HumanEval and ClassEval, covering both Text-to-Code and Code-to-Code tasks. We conduct a systematic evaluation of diverse LLMs under four settings: Direct Generation, Syntax-Constrained Generation, Retrieval-Augmented Generation (RAG), and Agent. Experiments reveal that Direct Generation performs poorly, whereas Syntax-Constrained Generation offers the best trade-off…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
