HerAgent: Rethinking the Automated Environment Deployment via Hierarchical Test Pyramid
Xiang Li, Siyu Lu, Federica Sarro, Claire Le Goues, He Ye

TL;DR
HerAgent is an automated environment setup tool that uses a hierarchical, execution-based validation approach to reliably configure complex software projects, significantly outperforming previous methods.
Contribution
The paper introduces the Environment Maturity Hierarchy and HerAgent, a novel approach that incrementally constructs executable environments guided by execution success, improving reliability over prior dependency-based methods.
Findings
HerAgent achieves up to 79.6% improvement on benchmarks.
It surpasses prior approaches by 66.7% on complex C/C++ projects.
HerAgent configures 11 to 30 environment instances that prior approaches cannot.
Abstract
Automated software environment setup is a prerequisite for testing, debugging, and reproducing failures, yet remains challenging in practice due to complex dependencies, heterogeneous build systems, and incomplete documentation. Recent work leverages large language models to automate this process, but typically evaluates success using weak signals such as dependency installation or partial test execution, which do not ensure that a project can actually run. In this paper, we argue that environment setup success should be evaluated through executable evidence rather than a single binary signal. We introduce the Environment Maturity Hierarchy, which defines three success levels based on progressively stronger execution requirements, culminating in successful execution of a project's main entry point. Guided by this hierarchy, we propose HerAgent, an automated environment setup approach…
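As a rough illustration of grading setup success by executable evidence rather than a single binary signal, the sketch below runs progressively stronger checks and reports the highest level reached. It is a minimal sketch, not HerAgent's implementation. The abstract only states that the top level is successful execution of the project's main entry point, so the level names `LEVEL_1` to `LEVEL_3`, the helpers `command_succeeds` and `assess_maturity`, and the choice of one command per level are all assumptions made for illustration.

```python
import subprocess
import sys
from enum import IntEnum


class Maturity(IntEnum):
    """Hypothetical labels for the three success levels (names are assumptions)."""
    NONE = 0
    LEVEL_1 = 1
    LEVEL_2 = 2
    LEVEL_3 = 3  # per the abstract: the project's main entry point runs


def command_succeeds(cmd, timeout=60):
    """Return True if the command exits with status 0 within the timeout."""
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def assess_maturity(checks):
    """Run checks from weakest to strongest and stop at the first failure.

    `checks` is a list of up to three commands (argument lists), one per level.
    A level counts only if every weaker level also passed.
    """
    level = Maturity.NONE
    ordered = (Maturity.LEVEL_1, Maturity.LEVEL_2, Maturity.LEVEL_3)
    for candidate, cmd in zip(ordered, checks):
        if not command_succeeds(cmd):
            break
        level = candidate
    return level


# Stand-in commands: a Python process that exits cleanly or with an error.
ok = [sys.executable, "-c", "pass"]
bad = [sys.executable, "-c", "import sys; sys.exit(1)"]

print(assess_maturity([ok, ok, ok]).name)   # LEVEL_3
print(assess_maturity([ok, bad, ok]).name)  # LEVEL_1
```

In a real project the stand-in commands would be replaced by that project's own build, test, and entry-point invocations. Stopping at the first failure reflects the idea that each level imposes a stronger execution requirement than the one before it.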
Taxonomy
Topics: Software Testing and Debugging Techniques · Software System Performance and Reliability · Software Engineering Research
