Sibyl-AutoResearch: Autonomous Research Needs Self-Evolving Trial-and-Error Harnesses, Not Paper Generators

Chengcheng Wang; Qinhua Xie; Wei He; Jianyuan Guo; Shiqi Wang; Chang Xu

arXiv:2605.22343 · cs.MA · May 22, 2026

TL;DR

Sibyl-AutoResearch introduces a self-evolving framework for autonomous scientific research that leverages trial-and-error harnesses to improve research workflows through systematic learning from failures and successes.

Contribution

The paper formalizes two auditable conversion units and implements the framework in SIBYL, so that autonomous systems can learn from trial outcomes and improve their research process across iterations.

Findings

01. Identified eight high-confidence conversion events with quick iteration recovery.

02. Blocked or routed five common failure classes to improve system robustness.

03. Demonstrated the recoverability of conversion units in realistic autonomous research environments.

Abstract

Autonomous research systems increasingly make the scientific workflow executable: agents can propose ideas, run code, inspect results, and draft papers. But executable workflows do not by themselves produce research judgment. We analyze where current systems lose trial experience: weak evidence becomes prose, pilot signals become broad claims, memory remains textual, and recurring process failures do not change later behavior. We introduce Sibyl-AutoResearch, a self-evolving AutoResearch framework built around Scientific Trial-and-Error Harnesses. A harness lets agents run bounded trials, preserve positive and negative outcomes, and route lessons into later planning, validation, claim scope, scheduling, critique, writing, and harness repair. We formalize this through two auditable conversion units: trial-to-behavior conversion, which links trial signals to later research actions, and…
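The trial-to-behavior conversion described in the abstract can be pictured as a small bookkeeping structure. The sketch below is illustrative only and is not taken from the SIBYL codebase: the names `Trial`, `Conversion`, `Harness`, `record`, `convert`, and `unconverted` are all hypothetical, and the only details drawn from the abstract are the routing targets and the idea that both positive and negative outcomes are preserved and linked to later actions.

```python
from dataclasses import dataclass
from typing import Dict, List

# Routing targets as listed in the abstract; all other names are hypothetical.
TARGETS = {
    "planning", "validation", "claim scope", "scheduling",
    "critique", "writing", "harness repair",
}


@dataclass(frozen=True)
class Trial:
    trial_id: str
    outcome: str    # "positive" or "negative"; both kinds are kept
    signal: str     # short description of what the trial showed
    max_steps: int  # assumed way of bounding a trial


@dataclass(frozen=True)
class Conversion:
    trial_id: str   # the trial whose signal motivated the change
    target: str     # which later stage the lesson is routed to
    action: str     # the concrete change in later behavior


class Harness:
    def __init__(self) -> None:
        self.trials: Dict[str, Trial] = {}
        self.conversions: List[Conversion] = []

    def record(self, trial: Trial) -> None:
        # Preserve the outcome whether or not the trial succeeded.
        if trial.outcome not in ("positive", "negative"):
            raise ValueError(f"unknown outcome: {trial.outcome}")
        self.trials[trial.trial_id] = trial

    def convert(self, trial_id: str, target: str, action: str) -> Conversion:
        # A conversion must point back at a recorded trial to stay auditable.
        if trial_id not in self.trials:
            raise KeyError(trial_id)
        if target not in TARGETS:
            raise ValueError(f"unknown target: {target}")
        conversion = Conversion(trial_id, target, action)
        self.conversions.append(conversion)
        return conversion

    def unconverted(self) -> List[str]:
        # Trials whose experience has not yet changed any later behavior.
        used = {c.trial_id for c in self.conversions}
        return [tid for tid in self.trials if tid not in used]
```

Under these assumptions, `unconverted()` surfaces the failure mode the abstract names: trial experience that was recorded but never changed a later decision.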

Code & Models

Repositories

Sibyl-Research-Team/AutoResearch-SibylSystem (GitHub)