Automated Extraction and Analysis of Developer's Rationale in Open Source Software

Mouna Dhaouadi; Bentley Oakes; Michalis Famelis

arXiv:2506.11005·cs.SE·June 16, 2025

Open Access

TL;DR

This paper presents an automated method that uses pre-trained models and large language models to extract and analyze developers' rationale in open source projects, helping contributors detect reasoning conflicts and problems that could cause design erosion over time.

Contribution

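It introduces an automated rationale extraction and analysis approach, built as an instantiation of the existing Kantara architecture, that leverages pre-trained models and large language models and includes structure-based mechanisms for detecting reasoning conflicts in open source software.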

Findings

01 Extraction of rationale sentences is shown to be feasible on the OOM-Killer module of the Linux Kernel.

02 Effective detection of reasoning conflicts and potential design erosion.

03 The approach generalizes well to multiple active open source projects.

Abstract

Contributors to open source software must deeply understand a project's history to make coherent decisions which do not conflict with past reasoning. However, inspecting all related changes to a proposed contribution requires intensive manual effort, and previous research has not yet produced an automated mechanism to expose and analyze these conflicts. In this article, we propose such an automated approach for rationale analyses, based on an instantiation of Kantara, an existing high-level rationale extraction and management architecture. Our implementation leverages pre-trained models and Large Language Models, and includes structure-based mechanisms to detect reasoning conflicts and problems which could cause design erosion in a project over time. We show the feasibility of our extraction and analysis approach using the OOM-Killer module of the Linux Kernel project, and investigate…

Taxonomy

Topics: Software Engineering Research