Bridging the Knowledge Void: Inference-time Acquisition of Unfamiliar Programming Languages for Coding Tasks
Chen Shen, Wei Cheng, Jingyue Yang, Huan Zhang, Yuhan Wu, Wei Hu

TL;DR
This paper introduces ILA-agent, a framework that lets LLMs learn unfamiliar programming languages at inference time by interacting with documentation and an execution environment, evaluated on Cangjie-bench.
Contribution
We propose the ILA-agent framework for inference-time language acquisition, allowing LLMs to learn new languages without finetuning, and create Cangjie-bench for evaluation.
Findings
ILA-agent outperforms retrieval-augmented baselines on coding tasks.
The framework enables incremental language learning during inference.
Analysis reveals emergent behavior patterns and remaining performance gaps.
Abstract
The proficiency of Large Language Models (LLMs) in coding tasks often reflects their extensive pre-training corpora, and it typically collapses when they are confronted with unfamiliar programming languages. Departing from data-intensive finetuning, we investigate the paradigm of Inference-time Language Acquisition (ILA), where an LLM masters an unfamiliar language through dynamic interaction with limited external resources. In this paper, we propose ILA-agent, a general ILA framework that equips LLMs with a set of behavioral primitives. By modeling essential human-like behaviors as a suite of tools, ILA-agent enables LLMs to incrementally explore, apply, and verify language knowledge through structured interactions with the official documentation and execution environment. To provide a rigorous evaluation in a low-resource setting, we construct Cangjie-bench, a multi-task…
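The explore, apply, and verify cycle described in the abstract can be pictured with a small toy loop. This is only an illustrative sketch, not the authors' implementation: the function names (`explore`, `apply`, `verify`, `acquire`), the `DOCS` table, and the one-line "language" are all hypothetical stand-ins, and the LLM call and execution environment are replaced by simple stubs.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical documentation for a toy language (not real Cangjie docs).
DOCS = {
    "printing": "Use `println(value)` to write a line to standard output.",
}


@dataclass
class Notes:
    """Knowledge accumulated during one inference session."""
    facts: list[str] = field(default_factory=list)


def explore(query: str) -> str:
    """Explore: look up a documentation section by keyword (toy retrieval)."""
    return DOCS.get(query, "")


def apply(notes: Notes) -> str:
    """Apply: draft a program from what is known so far (stub for an LLM call)."""
    if any("println" in fact for fact in notes.facts):
        return 'println("hello")'
    # Without any notes, fall back on a guess from a familiar language.
    return 'print("hello")'


def verify(program: str) -> tuple[bool, str]:
    """Verify: stand-in for compiling and running in an execution environment."""
    if program.startswith("println("):
        return True, "ok"
    return False, "error: unknown identifier 'print'"


def acquire(max_steps: int = 3) -> tuple[str, int]:
    """Alternate apply and verify, consulting the docs after each failure."""
    notes = Notes()
    program = ""
    for step in range(1, max_steps + 1):
        program = apply(notes)
        ok, _feedback = verify(program)
        if ok:
            return program, step
        # On failure, read the relevant docs and remember what was learned.
        notes.facts.append(explore("printing"))
    return program, max_steps


if __name__ == "__main__":
    print(acquire())
```

In this toy run the first draft fails verification, the loop reads the documentation entry, and the second draft passes. The point is only that knowledge gathered within the session changes later attempts, with no weight updates involved.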
Taxonomy
Topics: Software Engineering Research · Teaching and Learning Programming · Topic Modeling
