FrontierSmith: Synthesizing Open-Ended Coding Problems at Scale

Runyuan He; Qiuyang Mang; Shang Zhou; Kaiyuan Liu; Hanchen Li; Huanzhi Mao; Qizheng Zhang; Zerui Li; Bo Peng; Lufeng Cheng; Tianfu Fu; Yichuan Wang; Wenhao Chai; Jingbo Shang; Alex Dimakis; Joseph E. Gonzalez; Alvin Cheung

arXiv:2605.14445·cs.LG·May 15, 2026

TL;DR

FrontierSmith is an automated system that synthesizes open-ended coding problems from existing closed-ended coding tasks, with the goal of training stronger LLM coders. The authors report significant gains on benchmark datasets.

Contribution

The paper introduces an iterative approach that evolves diverse open-ended coding problems from closed-ended tasks, for use in training stronger LLM coders.

Findings

01. Training on synthesized problems improves model scores significantly.

02. Synthesized problems lead to longer, more human-like coding interactions.

03. The system effectively creates diverse, challenging coding problems from existing datasets.

Abstract

Many real-world coding challenges are open-ended and admit no known optimal solution. Yet, recent progress in LLM coding has focused on well-defined tasks such as feature implementation, bug fixing, and competitive programming. Open-ended coding remains a weak spot for LLMs, largely because open-ended training problems are scarce and expensive to construct. Our goal is to synthesize open-ended coding problems at scale to train stronger LLM coders. We introduce FrontierSmith, an automated system for iteratively evolving open-ended problems from existing closed-ended coding tasks. Starting from competitive programming problems, FrontierSmith generates candidate open-ended variants by changing the problems' goals, restricting outputs, and generalizing inputs. It then uses a quantitative idea divergence metric to select problems that elicit genuinely diverse approaches from different…
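The abstract as shown here is truncated and does not define the idea divergence metric, so the following is only an illustrative sketch of how a selection step of this kind could work, not the authors' method. It assumes, hypothetically, that each sampled solution to a candidate problem has already been summarized as a set of idea tags, and it scores a candidate by the mean pairwise Jaccard distance between those sets. The function names, the tags, and the threshold are all made up for illustration.

```python
from itertools import combinations


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def idea_divergence(idea_sets: list[set[str]]) -> float:
    """Hypothetical stand-in metric: mean pairwise Jaccard distance
    between the idea-tag sets of different sampled solutions."""
    pairs = list(combinations(idea_sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def select_open_ended(
    candidates: dict[str, list[set[str]]], threshold: float
) -> list[str]:
    """Keep candidates whose sampled solutions diverge at least `threshold`."""
    return [
        name
        for name, idea_sets in candidates.items()
        if idea_divergence(idea_sets) >= threshold
    ]


# Made-up example data: idea tags for three sampled solutions per candidate.
candidates = {
    "closed_variant": [{"dp"}, {"dp"}, {"dp"}],
    "open_variant": [{"greedy"}, {"local_search"}, {"dp", "greedy"}],
}

selected = select_open_ended(candidates, threshold=0.5)
print(selected)
```

In this toy data, every solver converges on the same idea for the closed variant, so its score is zero and it is filtered out, while the open variant draws distinct approaches and is kept. A real metric would need a way to extract and compare ideas from solutions, which this sketch deliberately leaves out.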

Code & Models

Repositories

frontiercs/FrontierSmith
github

Datasets

aoiandroid/papers
dataset · 28 dl
