TL;DR
ARIS is an open-source framework for autonomous research that employs adversarial multi-agent collaboration to improve the reliability and integrity of long-horizon scientific workflows.
Contribution
It introduces a novel multi-layered research harness that coordinates models and assurance mechanisms for autonomous scientific research.
Findings
ARIS includes over 65 reusable skills and model integrations.
It features a three-stage claim verification process.
A prototype self-improvement loop enhances research traceability.
Abstract
This report describes ARIS (Auto-Research-in-sleep), an open-source research harness for autonomous research, including its architecture, assurance mechanisms, and early deployment experience. The performance of agent systems built on LLMs depends on both the model weights and the harness around them, which governs what information to store, retrieve, and present to the model. For long-horizon research workflows, the central failure mode is not a visible breakdown but a plausible unsupported success: a long-running agent can produce claims whose evidential support is incomplete, misreported, or silently inherited from the executor's framing. Therefore, we present ARIS as a research harness that coordinates machine-learning research workflows through cross-model adversarial collaboration as a default configuration: an executor model drives forward progress while a reviewer from a…
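The executor and reviewer split described in the abstract can be sketched as a small loop. This is an illustrative sketch only, not ARIS code: the abstract is cut off before it says what the reviewer does, so the sketch assumes the reviewer's job is to object to claims that lack supporting evidence. The names `Claim`, `run_with_review`, `stub_executor`, `stub_reviewer`, and the evidence path are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Claim:
    """A statement produced by the executor, plus pointers to its support."""
    text: str
    evidence: List[str] = field(default_factory=list)  # e.g. log paths (hypothetical)


# Hypothetical interfaces: each "model" is modelled as a plain callable.
Executor = Callable[[str, List[str]], Claim]  # (task, reviewer feedback) -> claim
Reviewer = Callable[[Claim], List[str]]       # claim -> list of objections


def run_with_review(
    task: str,
    executor: Executor,
    reviewer: Reviewer,
    max_rounds: int = 3,
) -> Tuple[Claim, bool]:
    """Alternate executor and reviewer until the reviewer has no objections.

    Returns the last claim and whether the reviewer accepted it.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback: List[str] = []
    for _ in range(max_rounds):
        claim = executor(task, feedback)
        objections = reviewer(claim)
        if not objections:
            return claim, True   # accepted: no open objections
        feedback = objections    # feed objections into the next attempt
    return claim, False          # rounds exhausted, still unsupported


def stub_executor(task: str, feedback: List[str]) -> Claim:
    """Assumed behaviour: attaches evidence only after being challenged."""
    if feedback:
        return Claim(text=f"result for {task}", evidence=["runs/exp1.log"])
    return Claim(text=f"result for {task}")


def stub_reviewer(claim: Claim) -> List[str]:
    """Assumed behaviour: rejects any claim that cites no evidence."""
    return [] if claim.evidence else ["claim has no supporting evidence"]


if __name__ == "__main__":
    final_claim, accepted = run_with_review("ablation", stub_executor, stub_reviewer)
    print(accepted, final_claim.evidence)
```

In this sketch an unsupported claim can never be returned as accepted: the loop exits with `True` only when the reviewer raises no objections, and otherwise reports `False` once the round limit is reached. That mirrors the "plausible unsupported success" failure mode the abstract names.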
