TL;DR
STELLA is a self-evolving AI agent for biomedical research that autonomously improves its reasoning and tools, achieving state-of-the-art accuracy and demonstrating systematic performance growth through experience.
Contribution
Introduces STELLA, a novel self-evolving multi-agent system that autonomously enhances its capabilities and toolset for biomedical research tasks.
Findings
Achieves state-of-the-art accuracy on biomedical benchmarks, including approximately 26% on Humanity's Last Exam: Biomedicine and 54% on LAB-Bench: DBQA.
Performance of STELLA improves systematically with experience.
Outperforms leading models by up to 6 percentage points.
Abstract
The rapid growth of biomedical data, tools, and literature has created a fragmented research landscape that outpaces human expertise. While AI agents offer a solution, they typically rely on static, manually curated toolsets, limiting their ability to adapt and scale. Here, we introduce STELLA, a self-evolving AI agent designed to overcome these limitations. STELLA employs a multi-agent architecture that autonomously improves its own capabilities through two core mechanisms: an evolving Template Library for reasoning strategies and a dynamic Tool Ocean that expands as a Tool Creation Agent automatically discovers and integrates new bioinformatics tools. This allows STELLA to learn from experience. We demonstrate that STELLA achieves state-of-the-art accuracy on a suite of biomedical benchmarks, scoring approximately 26% on Humanity's Last Exam: Biomedicine, 54% on LAB-Bench: DBQA, and…
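The two mechanisms named in the abstract, a growing tool registry and a library of reusable reasoning templates, can be pictured with a toy sketch. This is not STELLA's implementation: the class names `ToolOcean` and `TemplateLibrary` borrow the abstract's terms, but every method, the `solve` function, and the `create_tool` callback (a stand-in for the Tool Creation Agent) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Tool = Callable[[str], str]


@dataclass
class ToolOcean:
    """Hypothetical registry of tools that can grow at run time."""
    tools: Dict[str, Tool] = field(default_factory=dict)

    def register(self, name: str, fn: Tool) -> None:
        self.tools[name] = fn


@dataclass
class TemplateLibrary:
    """Hypothetical store of reasoning templates, keyed by task type."""
    templates: Dict[str, List[str]] = field(default_factory=dict)

    def record(self, task_type: str, steps: List[str]) -> None:
        self.templates[task_type] = steps


def solve(task_type: str, query: str, ocean: ToolOcean,
          library: TemplateLibrary,
          create_tool: Callable[[str], Tool]) -> str:
    # Reuse a stored plan if one exists; otherwise fall back to a
    # one-step plan named after the task type (an assumption of this sketch).
    steps = library.templates.get(task_type, [task_type])
    result = query
    for name in steps:
        if name not in ocean.tools:
            # Stand-in for the Tool Creation Agent: build a missing tool
            # on demand and add it to the registry.
            ocean.register(name, create_tool(name))
        result = ocean.tools[name](result)
    # Store the plan that was used so later tasks of this type can reuse it.
    library.record(task_type, steps)
    return result
```

Calling `solve("lookup", "BRCA1", ToolOcean(), TemplateLibrary(), make_tool)` with a `make_tool` that returns simple string-wrapping functions leaves one new tool in the registry and one stored template, which is the sense in which such a system accumulates capability with use.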
