From I/O to Code with Discovery Agent
Yihong Dong, Jiaru Qian, Haoran Zhang, Peixu Wang, Binhua Li, Zhi Jin, Yongbin Li, Ge Li, Xiaokang Yang, Xue Jiang

TL;DR
This paper introduces DIO-Agent, an LLM-driven evolutionary search framework that synthesizes programs from input-output behavior and reports significant improvements over existing methods across difficulty levels.
Contribution
The paper presents a novel evolutionary approach that uses a mutation prior to guide program synthesis from input-output (IO) behavior and outperforms state-of-the-art baselines.
Findings
DIO-Agent outperforms traditional and SOTA evolution methods across all difficulty levels.
The mutation prior biases the search towards simpler hypotheses, improving efficiency.
Extensive experiments demonstrate superior performance across various LLMs and scaling strategies.
Abstract
The automatic synthesis of a program from any form of specification is regarded as a holy grail of computer science. Fueled by LLMs, NL2Code has achieved tremendous success, yet the fundamentally more challenging task of synthesizing programs from input-output behavior, which we refer to as IO2Code, remains largely unsolved. Whereas NL2Code can exploit the semantic alignment between natural language and code acquired during pretraining, IO2Code requires recovering underlying principles from concrete computational behavior, navigating a vast and underspecified hypothesis space. To address this, we propose DIO-Agent, a discovery agent for IO2Code. Our method frames IO2Code as an evolutionary search over discrete program space, in which an LLM serves as the mutation operator and concrete error signals from execution guide each mutation. To prevent the search from wandering into…
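The search loop described in the abstract (candidate programs are executed against IO examples, and concrete error messages are handed to a mutation operator) can be illustrated with a toy sketch. This is not the authors' implementation. Everything named here is an assumption made for illustration: `run_candidate`, `score`, `evolve`, and `toy_mutate` are hypothetical, the LLM is replaced by a fixed table of text edits, and the preference for shorter source is only a crude stand-in for a simplicity bias, since the abstract is cut off before it describes the actual mutation prior.

```python
from typing import Callable, List, Tuple

Example = Tuple[int, int]                       # (input, expected output)
Mutator = Callable[[str, List[str]], List[str]]  # (parent source, errors) -> children


def run_candidate(source: str, examples: List[Example]) -> List[str]:
    """Execute source that defines f(x); return one error message per failed example."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        func = namespace["f"]
    except Exception as exc:
        # A program that does not even load fails every example.
        return [f"load error: {exc!r}"] * len(examples)
    errors = []
    for x, expected in examples:
        try:
            got = func(x)
        except Exception as exc:
            errors.append(f"f({x}) raised {exc!r}")
            continue
        if got != expected:
            errors.append(f"f({x}) = {got}, expected {expected}")
    return errors


def score(source: str, examples: List[Example]) -> Tuple[int, int]:
    # Fewer failures first; ties broken by shorter source.
    # ASSUMPTION: the length tie-break is an illustrative simplicity bias only.
    return (len(run_candidate(source, examples)), len(source))


def evolve(seed: str, examples: List[Example], mutate: Mutator,
           generations: int = 10, population_size: int = 4) -> str:
    population = [seed]
    for _ in range(generations):
        population.sort(key=lambda s: score(s, examples))
        if not run_candidate(population[0], examples):
            return population[0]  # all examples pass
        children: List[str] = []
        for parent in population:
            # Execution errors are passed to the mutation operator as feedback.
            children.extend(mutate(parent, run_candidate(parent, examples)))
        merged = list(dict.fromkeys(population + children))  # dedupe, keep order
        merged.sort(key=lambda s: score(s, examples))
        population = merged[:population_size]
    return population[0]


def toy_mutate(parent: str, feedback: List[str]) -> List[str]:
    # HYPOTHETICAL placeholder for the LLM call: applies fixed text edits
    # and ignores the feedback, which a real mutation operator would read.
    edits = [("x + 1", "x * 2"), ("x * 2", "x * 2 + 1"), ("x + 1", "x * x")]
    return [parent.replace(old, new) for old, new in edits if old in parent]


SEED = "def f(x):\n    return x + 1\n"
EXAMPLES = [(0, 1), (1, 3), (2, 5)]  # behavior of the hidden target 2*x + 1
best = evolve(SEED, EXAMPLES, toy_mutate)
print(best)
```

In this toy run the seed fails two of the three examples, and the passing program is only reachable by mutating a child that scores worse than the seed. That is why the sketch mutates every member of the population rather than only the current best.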
