A Small-Scale System for Autoregressive Program Synthesis Enabling Controlled Experimentation

Russ Webb; Jason Ramapuram

arXiv:2602.09112·cs.AI·February 11, 2026

Open Access

TL;DR

This paper introduces Cadmus, a small-scale, cost-effective system for program synthesis research, enabling detailed analysis of model behavior and out-of-distribution generalization using a custom dataset and an integer VM.

Contribution

The paper presents Cadmus, a novel small-scale system with a dataset and model architecture that allows controlled experimentation in program synthesis research.

Findings

1. Cadmus models outperform GPT-5 on integer program completion tasks.

2. Small models enable detailed instrumentation and analysis of reasoning processes.

3. GPT-5 exhibits unknown priors affecting its reasoning, complicating certain investigations.

Abstract

What research can be pursued with small models trained to complete true programs? Researchers typically study program synthesis with large language models (LLMs), which introduce several problems: it is hard to know what is in or out of distribution, hard to understand the effects of fine-tuning and tokenization, and experiments place higher demands on compute and storage. We present Cadmus, a system that includes an integer virtual machine (VM), a dataset of true programs covering diverse tasks, and an autoregressive transformer model trained for under $200 of compute cost. The system can be used to study program completion, out-of-distribution representations, inductive reasoning, and instruction following in a setting where researchers have effective and affordable fine-grained control of the training distribution and the ability to inspect and instrument…
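
To make the idea of completing programs for an integer VM concrete, here is a minimal sketch of a stack-based integer interpreter together with a check that a proposed completion of a program prefix produces an expected value. The opcodes (`PUSH`, `POP`, `DUP`, `ADD`, `SUB`, `MUL`) and the helper names `run` and `completes_to` are hypothetical and chosen only for illustration; the abstract does not describe the Cadmus VM's instruction set, so none of this should be read as the authors' design.

```python
from typing import List, Optional

# Hypothetical instruction format: ("PUSH", n) or a one-element tuple such as ("ADD",).
# This is an illustrative toy, NOT the Cadmus VM's actual instruction set.
Instr = tuple


def run(program: List[Instr]) -> Optional[List[int]]:
    """Execute a straight-line program on an integer stack.

    Returns the final stack, or None on an unknown opcode or stack underflow.
    """
    stack: List[int] = []
    for instr in program:
        op = instr[0]
        if op == "PUSH" and len(instr) == 2:
            stack.append(instr[1])
        elif op == "POP" and len(stack) >= 1:
            stack.pop()
        elif op == "DUP" and len(stack) >= 1:
            stack.append(stack[-1])
        elif op in ("ADD", "SUB", "MUL") and len(stack) >= 2:
            b = stack.pop()  # right operand (top of stack)
            a = stack.pop()  # left operand
            if op == "ADD":
                stack.append(a + b)
            elif op == "SUB":
                stack.append(a - b)
            else:
                stack.append(a * b)
        else:
            return None  # malformed instruction or not enough operands
    return stack


def completes_to(prefix: List[Instr], completion: List[Instr], expected_top: int) -> bool:
    """Check whether prefix + completion runs cleanly and leaves expected_top on top."""
    result = run(prefix + completion)
    return result is not None and len(result) > 0 and result[-1] == expected_top


if __name__ == "__main__":
    prefix = [("PUSH", 3), ("PUSH", 4)]
    completion = [("ADD",), ("DUP",), ("MUL",)]  # computes (3 + 4) * (3 + 4)
    print(completes_to(prefix, completion, 49))
```

Because execution is deterministic and purely integer-valued, a completion sampled from a model can be scored by running it rather than by comparing it token by token against a reference, which is one reason a small VM is a convenient setting for controlled experiments.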

Taxonomy

Topics: Software Engineering Research · Software Testing and Debugging Techniques · Teaching and Learning Programming