AlphaGo Moment for Model Architecture Discovery
Yixiu Liu, Yang Nan, Weixian Xu, Xiangkun Hu, Lyumanshan Ye, Zhen Qin, Pengfei Liu

TL;DR
This paper introduces ASI-Arch, an autonomous AI system that designs neural architectures beyond human-defined search spaces, achieving state-of-the-art results and establishing an empirical scaling law for scientific discovery.
Contribution
We present the first fully autonomous system for neural architecture discovery that conducts scientific research independently, surpassing traditional NAS limitations.
Findings
Discovered 106 state-of-the-art linear attention architectures.
Conducted 1,773 autonomous experiments over 20,000 GPU hours.
Established an empirical scaling law for scientific discovery.
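To put the figures above in proportion, the short sketch below derives a few average ratios from them. It assumes that "over 20,000 GPU hours" can be read as roughly 20,000 GPU hours in total and that cost is spread evenly across experiments; neither assumption is confirmed by the summary, and the derived ratios are illustrative rather than numbers reported by the paper.

```python
# Figures quoted in the Findings list.
experiments = 1773
gpu_hours = 20_000          # assumption: treated as an approximate total
sota_architectures = 106

# Derived averages (illustrative only; not reported by the paper).
gpu_hours_per_experiment = gpu_hours / experiments
sota_yield = sota_architectures / experiments
gpu_hours_per_sota = gpu_hours / sota_architectures

print(f"{gpu_hours_per_experiment:.1f} GPU hours per experiment on average")
print(f"{sota_yield:.1%} of experiments yielded a state-of-the-art architecture")
print(f"{gpu_hours_per_sota:.0f} GPU hours per state-of-the-art architecture")
```

Under these assumptions, each experiment averages about 11 GPU hours and roughly one experiment in seventeen produces a state-of-the-art architecture.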
Abstract
While AI systems demonstrate exponentially improving capabilities, the pace of AI research itself remains linearly bounded by human cognitive capacity, creating an increasingly severe development bottleneck. We present ASI-Arch, the first demonstration of Artificial Superintelligence for AI research (ASI4AI) in the critical domain of neural architecture discovery--a fully autonomous system that shatters this fundamental constraint by enabling AI to conduct its own architectural innovation. Moving beyond traditional Neural Architecture Search (NAS), which is fundamentally limited to exploring human-defined spaces, we introduce a paradigm shift from automated optimization to automated innovation. ASI-Arch can conduct end-to-end scientific research in the domain of architecture discovery, autonomously hypothesizing novel architectural concepts, implementing them as executable code,…
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Software System Performance and Reliability · Advanced Software Engineering Methodologies
