Leveraging Language Models to Discover Evidence-Based Actions for OSS Sustainability
Nafiz Imtiaz Khan, Vladimir Filkov

TL;DR
This paper introduces a novel approach using large language models to extract evidence-based, actionable recommendations from software engineering literature, aiming to improve open source software sustainability.
Contribution
It develops a reproducible pipeline that leverages LLMs to systematically extract and organize concrete practices from research literature for OSS sustainability.
Findings
Extracted 1,922 actionable recommendations from 829 papers.
Filtered and validated 1,312 high-quality, evidence-linked actions.
Organized recommendations into categories connectable to project signals.
Abstract
When successful, Open Source Software (OSS) projects create enormous value, but most never reach a sustainable state. Recent work has produced accurate models that forecast OSS sustainability, yet these models rarely tell maintainers what to do: their features are often high-level socio-technical signals that are not directly actionable. Decades of empirical software engineering research have accumulated a large but underused body of evidence on concrete practices that improve project health. We close this gap by using LLMs as evidence miners over the SE literature. We design a RAG pipeline and a two-layer prompting strategy that extract researched actionables (ReACTs): concise, evidence-linked recommendations mapping to specific OSS practices. In the first layer, we systematically explore open LLMs and prompting techniques, selecting the best-performing combination to derive…
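The first-layer selection step described in the abstract (explore LLM and prompting-technique combinations, keep the best-performing one) can be illustrated with a minimal sketch. This is not the authors' implementation: the model names, technique names, scoring function, and scores below are all hypothetical placeholders, and the abstract does not say which metric the paper uses.

```python
from itertools import product
from typing import Callable, Iterable, Tuple


def select_best_combination(
    models: Iterable[str],
    techniques: Iterable[str],
    score: Callable[[str, str], float],
) -> Tuple[Tuple[str, str], float]:
    """Score every (model, technique) pair and return the top pair with its score."""
    best_pair = max(product(models, techniques), key=lambda pair: score(*pair))
    return best_pair, score(*best_pair)


# Hypothetical placeholders: the source does not name the models or techniques here.
MODELS = ["model-a", "model-b"]
TECHNIQUES = ["technique-x", "technique-y"]

# Made-up scores that only make the sketch runnable; these are NOT results from the paper.
TOY_SCORES = {
    ("model-a", "technique-x"): 0.5,
    ("model-a", "technique-y"): 0.7,
    ("model-b", "technique-x"): 0.9,
    ("model-b", "technique-y"): 0.6,
}


def toy_score(model: str, technique: str) -> float:
    """Stand-in for an evaluation of extraction quality (assumed, not from the paper)."""
    return TOY_SCORES[(model, technique)]


best, best_score = select_best_combination(MODELS, TECHNIQUES, toy_score)
print(best, best_score)
```

In a real pipeline, `toy_score` would be replaced by an evaluation of each combination's output against a labeled sample; the exhaustive grid is practical only while the number of combinations stays small.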
Taxonomy
Topics: Green IT and Sustainability · Open Source Software Innovations · Software Engineering Research
