Beyond pip install: Evaluating LLM Agents for the Automated Installation of Python Projects

Louis Milliken; Sungmin Kang; Shin Yoo

arXiv:2412.06294 · cs.SE · December 10, 2024

TL;DR

This paper introduces a benchmark and an agent for automating the installation of Python repositories using LLMs, highlighting current capabilities and challenges in autonomous software engineering tasks.

Contribution

It presents a new benchmark for repository installation tasks and the Installamatic agent, advancing the evaluation of LLMs in automating dependency management.

Findings

01

The agent installs 55% of the studied repositories in at least one of ten attempts

02

Identified key challenges in automating repository installation

03

Provided insights into improving LLM-based software engineering agents
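
The headline figure in the first finding is an "installed at least once" rate: a repository counts as installed if any of the agent's attempts on it succeeds. A minimal sketch of that arithmetic follows. The function name, the data layout, and the example outcomes are all hypothetical and are not taken from the paper or from the coinse/installamatic repository.

```python
from typing import Dict, List


def at_least_once_rate(results: Dict[str, List[bool]]) -> float:
    """Return the fraction of repositories with at least one successful attempt.

    `results` maps a repository name to the outcomes of its repeated
    installation attempts (True = the installation was verified as working).
    """
    if not results:
        raise ValueError("results must contain at least one repository")
    installed = sum(1 for attempts in results.values() if any(attempts))
    return installed / len(results)


# Hypothetical outcomes for illustration only; these are NOT the paper's data.
example = {
    "repo_a": [False, True, False],
    "repo_b": [False, False, False],
    "repo_c": [True, True, True],
    "repo_d": [False, False, True],
}

rate = at_least_once_rate(example)
print(f"{rate:.0%} of repositories installed at least once")
```

Counting a repository once regardless of how many attempts succeed makes this a more lenient measure than the per-attempt success rate, so the two numbers should not be compared directly.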

Abstract

Many works have recently proposed the use of Large Language Model (LLM) based agents for performing 'repository level' tasks, loosely defined as a set of tasks whose scopes are greater than a single file. This has led to speculation that the orchestration of these repository-level tasks could lead to software engineering agents capable of performing almost independently of human intervention. However, of the suite of tasks that would need to be performed by this autonomous software engineering agent, we argue that one important task is missing, which is to fulfil project-level dependencies by installing other repositories. To investigate the feasibility of this repository-level installation task, we introduce a benchmark of repository installation tasks curated from 40 open source Python projects, which includes a ground truth installation process for each target repository. Further,…

Code & Models

Repositories

coinse/installamatic
none · Official

Taxonomy

Topics: Scientific Computing and Data Management