Towards a Benchmark for Dependency Decision-Making

Tanmay Singla; Berk Çakar; Paschal C. Amusuo; James C. Davis

arXiv:2601.00205 · cs.SE · February 2, 2026

TL;DR

This paper introduces DepDec-Bench, a benchmark for evaluating dependency decision-making in AI coding agents, emphasizing security, efficiency, and policy compliance beyond mere functional correctness.

Contribution

It presents a new benchmark and evaluation framework grounded in real-world dependency change data, highlighting security and policy considerations that evaluations focused on functional correctness often overlook.

Findings

01

Agents often select vulnerable dependency versions.

02

Dependency decisions can have negative security impacts.

03

Benchmark evaluates safe, disciplined dependency management.

Abstract

AI coding agents increasingly modify real software repositories and make dependency decisions, including adding, removing, or updating third-party packages. These choices can materially affect security posture and maintenance burden, yet repository-level evaluations largely emphasize test passing and executability without explicitly scoring whether systems (i) reuse existing dependencies, (ii) avoid unnecessary additions, or (iii) select versions that satisfy security and policy constraints. We propose DepDec-Bench, a benchmark for evaluating dependency decision-making beyond functional correctness. To ground DepDec-Bench in real-world behavior, we conduct a preliminary study of 117,062 dependency changes from agent- and human-authored pull requests across seven ecosystems. We show that coding agents frequently make dependency decisions with security consequences that remain invisible…
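The three criteria named in the abstract (reusing existing dependencies, avoiding unnecessary additions, and selecting versions that satisfy security constraints) can be illustrated with a minimal sketch. This is not DepDec-Bench's scoring method, which the abstract does not describe. The `DependencyChange` type, the `flag_change` function, the flag names, the package names, and all input sets are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DependencyChange:
    """One dependency edit proposed in a pull request (hypothetical model)."""
    name: str
    version: str
    action: str  # "add", "remove", or "update"


def flag_change(change, declared, equivalents, required, vulnerable):
    """Return a list of flags for a single dependency change.

    declared:    names already present in the project manifest
    equivalents: maps a package name to packages offering similar functionality
                 (an assumed, hand-curated mapping)
    required:    names the task genuinely needs (assumed to be known)
    vulnerable:  (name, version) pairs with known advisories (assumed input)
    """
    flags = []
    if change.action == "add":
        # (i) An already-declared package could have covered this need.
        if equivalents.get(change.name, set()) & declared:
            flags.append("could_reuse_existing")
        # (ii) The task does not require this package at all.
        if change.name not in required:
            flags.append("unnecessary_addition")
    if change.action in ("add", "update"):
        # (iii) The selected version has a known advisory.
        if (change.name, change.version) in vulnerable:
            flags.append("vulnerable_version")
    return flags


# Hypothetical example: adding "new-http" while "old-http" is already declared.
change = DependencyChange(name="new-http", version="1.0.0", action="add")
flags = flag_change(
    change,
    declared={"old-http"},
    equivalents={"new-http": {"old-http"}},
    required=set(),
    vulnerable={("new-http", "1.0.0")},
)
print(flags)
```

None of these flags depends on whether the project's tests pass, which is the gap the abstract points to in repository-level evaluations that emphasize test passing and executability.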

Taxonomy

Topics: Software Engineering Research · Information and Cyber Security · Advanced Malware Detection Techniques