Promises, Perils, and (Timely) Heuristics for Mining Coding Agent Activity
Romain Robbes, Théo Matricon, Thomas Degueule, Andre Hora, Stefano Zacchiroli

TL;DR
This paper examines the rapid adoption of coding agents leveraging LLMs, analyzing their impact on software engineering practices through repository traces, highlighting benefits, risks, and effective heuristics.
Contribution
It provides a comprehensive analysis of coding agent activity on GitHub, introducing new heuristics and insights into their promises and perils for software engineering.
Findings
Coding agents leave detectable traces in repositories
They significantly influence software development practices
Heuristics can mitigate associated risks
Abstract
In 2025, coding agents have seen very rapid adoption. Coding agents leverage Large Language Models (LLMs) in ways that are markedly different from LLM-based code completion, making their study critical. Moreover, unlike LLM-based completion, coding agents leave visible traces in software repositories, enabling the use of Mining Software Repositories (MSR) techniques to study their impact on software engineering (SE) practices. This paper documents the promises, perils, and heuristics that we have gathered from studying coding agent activity on GitHub.
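As an illustration of what mining such traces might look like, here is a minimal sketch of one possible heuristic: scanning commit messages for marker lines that an agent may leave behind. The agent names (`example-agent`, `demo-bot`), the marker patterns, and the function `detect_agents` are hypothetical placeholders chosen for this example. They are not the heuristics documented in the paper, and real agents may leave different traces or none at all.

```python
import re

# Hypothetical marker patterns, keyed by a made-up agent name.
# These are illustrative assumptions, not markers taken from the paper.
AGENT_MARKER_PATTERNS = {
    "example-agent": re.compile(
        r"^co-authored-by:.*\bexample-agent\b", re.IGNORECASE | re.MULTILINE
    ),
    "demo-bot": re.compile(
        r"^generated with demo-bot\b", re.IGNORECASE | re.MULTILINE
    ),
}


def detect_agents(commit_message: str) -> set[str]:
    """Return the names of hypothetical agents whose marker appears in the message."""
    return {
        name
        for name, pattern in AGENT_MARKER_PATTERNS.items()
        if pattern.search(commit_message)
    }


if __name__ == "__main__":
    message = (
        "Fix parser edge case\n\n"
        "Co-authored-by: example-agent <agent@example.invalid>\n"
    )
    print(detect_agents(message))
```

A pattern-based check like this is cheap to run over a large commit history, but it only finds agents that announce themselves. Commits where the marker was stripped or never written would be missed, so any counts derived this way are best read as lower bounds.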
Taxonomy
Topics: Scientific Computing and Data Management · Ethics and Social Impacts of AI · Software Engineering Research
