AlphaLab: Autonomous Multi-Agent Research Across Optimization Domains with Frontier LLMs

Brendan R. Hogan; Xiwen Chen; James T. Wilson; Kashif Rasul; Adel Boyarsky; Thomas Kamei; Anderson Schneider; Yuriy Nevmyvaka

arXiv:2604.08590·cs.LG·April 13, 2026

TL;DR

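AlphaLab is an autonomous research system that uses frontier LLMs to automate the full experimental cycle across diverse domains without human intervention, reporting GPU kernels 4.4x faster than torch.compile on average, 22% lower LLM pretraining validation loss, and 23-25% better traffic forecasts than standard baselines.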

Contribution

AlphaLab introduces a fully autonomous pipeline that adapts to new domains through adapters generated by the model itself, covering code generation, evaluation, and large-scale experiments.

Findings

01 GPU kernel optimization: 4.4x faster than torch.compile on average
02 LLM pretraining: 22% lower validation loss than baseline
03 Traffic forecasting: 23-25% improvement over standard baselines

Abstract

We present AlphaLab, an autonomous research harness that leverages frontier LLM agentic capabilities to automate the full experimental cycle in quantitative, computation-intensive domains. Given only a dataset and a natural-language objective, AlphaLab proceeds through three phases without human intervention: (1) it adapts to the domain and explores the data, writing analysis code and producing a research report; (2) it constructs and adversarially validates its own evaluation framework; and (3) it runs large-scale GPU experiments via a Strategist/Worker loop, accumulating domain knowledge in a persistent playbook that functions as a form of online prompt optimization. All domain-specific behavior is factored into adapters generated by the model itself, so the same pipeline handles qualitatively different tasks without modification. We evaluate AlphaLab with two frontier LLMs (GPT-5.2…
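To make the third phase of the abstract more concrete, here is a minimal sketch of a Strategist/Worker loop that accumulates lessons in a persistent playbook and feeds them back into the next prompt. It is an illustration under stated assumptions, not the authors' implementation: the names `Playbook`, `Result`, `run_loop`, `stub_strategist`, and `stub_worker` are hypothetical, the strategist is a deterministic stub standing in for an LLM call, and the worker is a toy scoring function standing in for a GPU experiment.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Playbook:
    """Persistent notes carried across rounds (hypothetical structure)."""
    lessons: List[str] = field(default_factory=list)

    def as_prompt(self) -> str:
        # Render the accumulated lessons as text that a strategist could read.
        return "\n".join(f"- {lesson}" for lesson in self.lessons)


@dataclass
class Result:
    proposal: str
    score: float  # lower is better in this sketch


def run_loop(
    strategist: Callable[[str], str],
    worker: Callable[[str], float],
    rounds: int,
    playbook: Optional[Playbook] = None,
) -> Tuple[Optional[Result], Playbook]:
    """Alternate between proposing (strategist) and evaluating (worker)."""
    if playbook is None:
        playbook = Playbook()
    best: Optional[Result] = None
    for _ in range(rounds):
        # The strategist sees the playbook so far; this is the "online prompt
        # optimization" idea in miniature.
        proposal = strategist(playbook.as_prompt())
        score = worker(proposal)
        if best is None or score < best.score:
            best = Result(proposal, score)
            playbook.lessons.append(f"{proposal} improved score to {score:.3f}")
        else:
            playbook.lessons.append(f"{proposal} did not help (score {score:.3f})")
    return best, playbook


def stub_strategist(playbook_text: str) -> str:
    # Assumption: a real strategist would be an LLM call. This stub simply
    # walks a fixed candidate list based on how many lessons exist.
    candidates = ["lr=0.1", "lr=0.01", "lr=0.001"]
    tried = playbook_text.count("\n") + 1 if playbook_text else 0
    return candidates[tried % len(candidates)]


def stub_worker(proposal: str) -> float:
    # Assumption: a real worker would launch an experiment. This toy
    # objective is minimized at lr=0.01.
    lr = float(proposal.split("=")[1])
    return abs(lr - 0.01)


if __name__ == "__main__":
    best, playbook = run_loop(stub_strategist, stub_worker, rounds=3)
    print(best)
    print(playbook.as_prompt())
```

The design choice worth noting is that the playbook is plain text appended after every round, whether the proposal helped or not, so the strategist's next prompt carries both successes and failures.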

Code & Models

Repositories

https://brendanhogan.github.io/alphalab-paper