A Small-Scale System for Autoregressive Program Synthesis Enabling Controlled Experimentation
Russ Webb, Jason Ramapuram

TL;DR
This paper introduces Cadmus, a small-scale, cost-effective system for program synthesis research, enabling detailed analysis of model behavior and out-of-distribution generalization using a custom dataset and an integer VM.
Contribution
The paper presents Cadmus, a novel small-scale system with a dataset and model architecture that allows controlled experimentation in program synthesis research.
Findings
Cadmus models outperform GPT-5 on integer program completion tasks.
Small models enable detailed instrumentation and analysis of reasoning processes.
GPT-5 exhibits unknown priors affecting its reasoning, complicating certain investigations.
Abstract
What research can be pursued with small models trained to complete true programs? Typically, researchers study program synthesis via large language models (LLMs), which introduce issues such as knowing what is in or out of distribution, understanding fine-tuning effects, understanding the effects of tokenization, and higher demand on compute and storage to carry out experiments. We present a system called Cadmus, which includes an integer virtual machine (VM), a dataset composed of true programs of diverse tasks, and an autoregressive transformer model that is trained for under $200 of compute cost. The system can be used to study program completion, out-of-distribution representations, inductive reasoning, and instruction following in a setting where researchers have effective and affordable fine-grained control of the training distribution and the ability to inspect and instrument…
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Teaching and Learning Programming
