AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery

Yu Li; Chenyang Shao; Xinyang Liu; Ruotong Zhao; Peijie Liu; Hongyuan Su; Zhibin Chen; Qinglong Yang; Anjie Xu; Yi Fang; Qingbin Zeng; Tianxing Li; Jingbo Xu; Fengli Xu; Yong Li; Tie-Yan Liu

arXiv:2604.05550·cs.CL·April 8, 2026

TL;DR

AutoSOTA is an automated multi-agent system that reproduces published state-of-the-art models and optimizes them into new state-of-the-art results, reducing the manual effort of empirical AI research.

Contribution

The paper introduces AutoSOTA, a novel end-to-end automated research system that advances the latest AI models to new SOTA levels with minimal human intervention.

Findings

01. AutoSOTA discovered 105 new SOTA models that surpass the original methods.

02. Replication and optimization take about five hours per paper on average.

03. Case studies show that AutoSOTA can identify architectural and algorithmic innovations.

Abstract

Artificial intelligence research increasingly depends on prolonged cycles of reproduction, debugging, and iterative refinement to achieve State-Of-The-Art (SOTA) performance, creating a growing need for systems that can accelerate the full pipeline of empirical model optimization. In this work, we introduce AutoSOTA, an end-to-end automated research system that advances the latest SOTA models published in top-tier AI papers to reproducible and empirically improved new SOTA models. We formulate this problem through three tightly coupled stages: resource preparation and goal setting; experiment evaluation; and reflection and ideation. To tackle this problem, AutoSOTA adopts a multi-agent architecture with eight specialized agents that collaboratively ground papers to code and dependencies, initialize and repair execution environments, track long-horizon experiments, generate and schedule…
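The three stages named in the abstract can be pictured as a simple loop. What follows is a minimal sketch under stated assumptions, not AutoSOTA's implementation: the names `RunState`, `prepare`, and `run_loop`, the scalar score, and the "higher is better" rule are all hypothetical, and the eight agents are collapsed into two caller-supplied functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RunState:
    """Hypothetical record of one optimization run (not taken from the paper)."""
    baseline_score: float                     # score of the reproduced method
    best_score: float                         # best score observed so far
    history: List[float] = field(default_factory=list)


def prepare(baseline_score: float) -> RunState:
    """Stage 1 (resource preparation and goal setting), reduced to a stub.

    Assumption: the goal is simply to beat the reproduced baseline score.
    """
    return RunState(baseline_score=baseline_score, best_score=baseline_score)


def run_loop(
    state: RunState,
    propose: Callable[[RunState], int],
    evaluate: Callable[[int], float],
    max_rounds: int,
) -> RunState:
    """Alternate ideation and evaluation for a fixed number of rounds."""
    for _ in range(max_rounds):
        idea = propose(state)       # Stage 3: reflection and ideation
        score = evaluate(idea)      # Stage 2: experiment evaluation
        state.history.append(score)
        if score > state.best_score:  # assumption: higher score is better
            state.best_score = score
    return state


if __name__ == "__main__":
    # Toy stand-ins: each "idea" is an index into a fixed list of score changes.
    deltas = [0.5, -0.2, 1.0]
    final = run_loop(
        prepare(baseline_score=70.0),
        propose=lambda s: len(s.history),
        evaluate=lambda i: 70.0 + deltas[i],
        max_rounds=len(deltas),
    )
    print(final.best_score > final.baseline_score)
```

In this toy run the loop keeps the best of three candidate scores. A real system would replace `propose` and `evaluate` with agents that edit code and launch experiments, which this sketch does not attempt to model.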
