TL;DR
Sibyl-AutoResearch introduces a self-evolving framework for autonomous scientific research that leverages trial-and-error harnesses to improve research workflows through systematic learning from failures and successes.
Contribution
The paper presents a framework formalized through two auditable conversion units and implements it in SIBYL, enabling autonomous systems to learn from trial outcomes and improve their research process over iterations.
Findings
Identified eight high-confidence conversion events with quick iteration recovery.
Blocked or routed five common failure classes to improve system robustness.
Demonstrated the recoverability of conversion units in realistic autonomous research environments.
Abstract
Autonomous research systems increasingly make the scientific workflow executable: agents can propose ideas, run code, inspect results, and draft papers. But executable workflows do not by themselves produce research judgment. We analyze where current systems lose trial experience: weak evidence becomes prose, pilot signals become broad claims, memory remains textual, and recurring process failures do not change later behavior. We introduce Sibyl-AutoResearch, a self-evolving AutoResearch framework built around Scientific Trial-and-Error Harnesses. A harness lets agents run bounded trials, preserve positive and negative outcomes, and route lessons into later planning, validation, claim scope, scheduling, critique, writing, and harness repair. We formalize this through two auditable conversion units: trial-to-behavior conversion, which links trial signals to later research actions, and…
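The harness idea in the abstract (run bounded trials, keep both positive and negative outcomes, route lessons into later stages) can be illustrated with a minimal sketch. Only the stage names come from the abstract; the `Trial`, `Conversion`, and `Harness` classes, their fields, and their methods are hypothetical names chosen for illustration and are not taken from the SIBYL implementation.

```python
from dataclasses import dataclass, field

# Stage names follow the abstract; all class and method names are hypothetical.
STAGES = (
    "planning", "validation", "claim_scope", "scheduling",
    "critique", "writing", "harness_repair",
)


@dataclass
class Trial:
    trial_id: str
    signal: str      # what the bounded trial showed
    positive: bool   # negative outcomes are preserved as well


@dataclass
class Conversion:
    trial_id: str    # the trial that produced the lesson
    stage: str       # the later stage the lesson is routed to
    action: str      # the concrete change in later behavior


@dataclass
class Harness:
    trials: dict = field(default_factory=dict)
    conversions: list = field(default_factory=list)

    def record(self, trial: Trial) -> None:
        """Store a trial outcome, whether positive or negative."""
        self.trials[trial.trial_id] = trial

    def route(self, trial_id: str, stage: str, action: str) -> Conversion:
        """Link a recorded trial to a later research action (auditable)."""
        if trial_id not in self.trials:
            raise KeyError(f"unknown trial: {trial_id}")
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        conversion = Conversion(trial_id, stage, action)
        self.conversions.append(conversion)
        return conversion

    def unconverted(self) -> list:
        """Trials whose lessons have not yet changed any later behavior."""
        used = {c.trial_id for c in self.conversions}
        return [t for tid, t in self.trials.items() if tid not in used]


harness = Harness()
harness.record(Trial("t1", "pilot effect vanished at larger seed count", False))
harness.record(Trial("t2", "ablation confirmed component matters", True))
harness.route("t1", "claim_scope", "restrict claim to the pilot setting")
```

In this sketch, `unconverted()` surfaces trials whose experience would otherwise be lost, which is the failure mode the abstract describes as trial signals not changing later behavior.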
