Training with Pseudo-Code for Instruction Following
Prince Kumar, Rudra Murthy, Riyaz Bhat, Danish Contractor

TL;DR
This paper introduces a training-time method that improves large language models' instruction following by fine-tuning them on instruction-tuning data augmented with pseudo-code representations of the instructions. The authors report gains on instruction-following benchmarks while reasoning performance is maintained or improved.
Contribution
The authors propose a novel instruction-tuning approach using pseudo-code data, improving instruction-following performance without sacrificing reasoning capabilities.
Findings
8-21% improvement on instruction-following benchmarks
Up to 30% average gain across multiple tasks
Models maintain or improve reasoning performance
Abstract
Despite rapid advances in the capabilities of Large Language Models (LLMs), they continue to struggle with following relatively simple and unambiguous instructions, particularly when compositional structure is involved. Recent work suggests that models may follow instructions more effectively when they are expressed in pseudo-code rather than natural language. However, writing pseudo-code programs can be tedious, and relying on few-shot demonstrations or inference-time code prompting is often unnatural for non-expert users of LLMs. To overcome these limitations, we propose a training-time approach that fine-tunes LLMs using instruction-tuning data augmented with pseudo-code representations of natural language instructions paired with final responses. We evaluate our method on 12 publicly available benchmarks spanning instruction-following, mathematical reasoning, and commonsense…
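To make the data augmentation described in the abstract concrete, here is a minimal sketch of how one instruction-tuning record might be turned into a training pair whose target holds a pseudo-code rendering of the instruction followed by the final response. The record layout, the function name `augment_with_pseudo_code`, the `[PSEUDOCODE]` and `[RESPONSE]` markers, and the hand-written pseudo-code are all assumptions made for illustration; the abstract does not specify the authors' actual format.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One instruction-tuning record (hypothetical layout)."""
    instruction: str
    response: str


def augment_with_pseudo_code(
    example: Example,
    pseudo_code: str,
    pc_tag: str = "[PSEUDOCODE]",   # assumed marker, not from the paper
    resp_tag: str = "[RESPONSE]",   # assumed marker, not from the paper
) -> dict:
    """Build a prompt/target pair for supervised fine-tuning.

    The prompt stays in natural language, so end users never have to
    write pseudo-code. The target contains the pseudo-code version of
    the instruction followed by the final response.
    """
    target = (
        f"{pc_tag}\n{pseudo_code.strip()}\n"
        f"{resp_tag}\n{example.response.strip()}"
    )
    return {"prompt": example.instruction.strip(), "target": target}


# Usage with a hand-written pseudo-code rendering of the instruction.
record = Example(
    instruction="List three fruits in uppercase, separated by commas.",
    response="APPLE, BANANA, CHERRY",
)
pseudo = '''
def respond():
    fruits = pick_fruits(count=3)
    return ", ".join(f.upper() for f in fruits)
'''
pair = augment_with_pseudo_code(record, pseudo)
print(pair["target"])
```

Keeping the prompt in plain natural language reflects the abstract's motivation that inference-time code prompting is unnatural for non-expert users; only the training target carries the pseudo-code.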
