AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

Jiaqi Liu; Shi Qiu; Mairui Li; Bingzhou Li; Haonian Ji; Siwei Han; Xinyu Ye; Peng Xia; Zihan Dong; Congyu Zhang; Letian Zhang; Guiming Chen; Haoqin Tu; Xinyu Yang; Lu Feng; Xujiang Zhao; Haifeng Chen; Jiawei Zhou; Xiao Wang; Weitong Zhang; Hongtu Zhu; Yun Li; Jieru Mei; Hongliang Fei; Jiaheng Zhang; Linjie Li; Linjun Zhang; Yuyin Zhou; Sheng Wang; Caiming Xiong; James Zou; Zeyu Zheng; Cihang Xie; Mingyu Ding; Huaxiu Yao

arXiv:2605.20025 · cs.AI · May 20, 2026

TL;DR

AutoResearchClaw is a multi-agent autonomous research system that combines iterative hypothesis testing, human collaboration, and experience accumulated across runs; it outperforms AI Scientist v2 by 54.7% on ARC-Bench.

Contribution

The paper introduces AutoResearchClaw, a novel multi-agent pipeline with mechanisms for debate, self-healing execution, verifiable results, human-in-the-loop collaboration, and cross-run learning.

Findings

01. AutoResearchClaw outperforms AI Scientist v2 by 54.7% on ARC-Bench.

02. Targeted human intervention at key decision points improves performance.

03. Cross-run learning converts past mistakes into future safeguards.
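
The cross-run learning finding can be pictured with a small sketch. The code below is only an illustration under stated assumptions, not the paper's implementation: the `LessonStore` class, its JSON file format, and the substring-matching rule are all hypothetical. It shows one plausible way a failure recorded in one run could surface as a safeguard before a later run starts.

```python
import json
from pathlib import Path
from typing import Dict, List


class LessonStore:
    """Hypothetical store that persists lessons between runs as a JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load lessons left behind by earlier runs, if any exist.
        self.lessons: List[Dict[str, str]] = (
            json.loads(path.read_text()) if path.exists() else []
        )

    def record(self, trigger: str, safeguard: str) -> None:
        """Save a lesson: when `trigger` appears in a plan, apply `safeguard`."""
        lesson = {"trigger": trigger, "safeguard": safeguard}
        if lesson not in self.lessons:
            self.lessons.append(lesson)
            self.path.write_text(json.dumps(self.lessons, indent=2))

    def safeguards_for(self, plan: str) -> List[str]:
        """Return safeguards whose trigger text occurs in the plan (assumed rule)."""
        plan_lower = plan.lower()
        return [
            lesson["safeguard"]
            for lesson in self.lessons
            if lesson["trigger"].lower() in plan_lower
        ]
```

In this sketch, a first run that hits a failure would call `record(...)`, and a later run would call `safeguards_for(plan)` before executing, so the earlier mistake becomes a check on the new plan.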

Abstract

Automating scientific discovery requires more than generating papers from ideas. Real research is iterative: hypotheses are challenged from multiple perspectives, experiments fail and inform the next attempt, and lessons accumulate across cycles. Existing autonomous research systems often model this process as a linear pipeline: they rely on single-agent reasoning, stop when execution fails, and do not carry experience across runs. We present AutoResearchClaw, a multi-agent autonomous research pipeline built on five mechanisms: structured multi-agent debate for hypothesis generation and result analysis, a self-healing executor with a Pivot/Refine decision loop that transforms failures into information, verifiable result reporting that prevents fabricated numbers and hallucinated citations, human-in-the-loop collaboration with seven intervention modes spanning full…
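
The abstract names a self-healing executor with a Pivot/Refine decision loop but does not spell out its policy here. The following is a minimal sketch of what such a loop could look like, assuming a simple rule: retry the same approach a fixed number of times (Refine), then move to the next candidate approach (Pivot). The names `self_healing_run`, `decide`, `max_refines`, and the `execute` callback signature are all hypothetical and are not taken from the paper or its repository.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Decision(Enum):
    REFINE = "refine"  # keep the current approach and repair it
    PIVOT = "pivot"    # abandon the current approach and try another


@dataclass
class Attempt:
    approach: str
    error: Optional[str]  # None means the attempt succeeded


@dataclass
class RunLog:
    attempts: List[Attempt] = field(default_factory=list)


def decide(failures_on_approach: int, max_refines: int) -> Decision:
    """Assumed policy: refine up to `max_refines` times, then pivot."""
    if failures_on_approach <= max_refines:
        return Decision.REFINE
    return Decision.PIVOT


def self_healing_run(
    approaches: List[str],
    execute: Callable[[str, int], Optional[str]],
    max_refines: int = 2,
) -> RunLog:
    """Try each approach in turn; every failure is logged, never discarded.

    `execute(approach, refinement_round)` returns None on success or an
    error message on failure.
    """
    log = RunLog()
    for approach in approaches:
        failures = 0
        while True:
            error = execute(approach, failures)
            log.attempts.append(Attempt(approach, error))
            if error is None:
                return log  # success ends the run
            failures += 1
            if decide(failures, max_refines) is Decision.PIVOT:
                break  # move on to the next approach
    return log  # every approach was exhausted
```

The point of the sketch is that a failed execution feeds a decision instead of ending the run, and the log keeps every failed attempt so that later analysis can use it.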

Code & Models

Repositories

aiming-lab/AutoResearchClaw
github

Datasets

AIMING-Lab-UNC/ARC-Bench
dataset · 292 dl
