DocFetch - Towards Generating Software Documentation from Multiple Software Artifacts
Akhila Sri Manasa Venigalla, Sridhar Chimalakonda

TL;DR
DocFetch is a novel approach that leverages multiple software artifacts and large language models to semi-automatically generate comprehensive software documentation, addressing the scattered nature of information across artifacts.
Contribution
It introduces a multi-layer, prompt-based LLM framework that generates several documentation types from multiple software artifacts, improving on methods that use source code alone.
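To make the idea of layered prompting concrete, here is a minimal sketch of how two prompt layers could be chained: one layer extracts relevant notes from each artifact, and a second layer merges those notes into structured documentation. This summary does not describe the paper's actual prompts or layers, so everything here is an assumption for illustration: the `Artifact` class, the artifact kinds, the function names, and the prompt wording are all hypothetical. No LLM is called; the code only builds the prompt strings.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    """One software artifact (hypothetical structure, not from the paper)."""
    kind: str  # illustrative labels such as "source_code" or "issue"
    text: str


def layer_one_prompt(artifact: Artifact, doc_type: str) -> str:
    """Layer 1 (assumed): ask for notes relevant to one documentation type."""
    return (
        f"From the following {artifact.kind}, extract information relevant to "
        f"'{doc_type}' documentation.\n\n{artifact.text}"
    )


def layer_two_prompt(extracted_notes: list[str], doc_type: str) -> str:
    """Layer 2 (assumed): merge per-artifact notes into one structured document."""
    joined = "\n".join(f"- {note}" for note in extracted_notes)
    return (
        f"Combine the notes below into structured '{doc_type}' documentation.\n\n"
        f"{joined}"
    )
```

In use, the output of an LLM call on each layer-one prompt would be collected into `extracted_notes` and passed to `layer_two_prompt`; which model is called and how its responses are parsed are left out because the summary does not specify them.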
Findings
Achieved a BLEU-4 score of 43.24% for API and file info generation.
Reported BLEU-4 scores close to 30% for other documentation types.
Demonstrated effective semi-automatic documentation generation with minimal effort.
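For readers unfamiliar with the metric behind these numbers, the following is a minimal sketch of sentence-level BLEU-4 against a single reference: the geometric mean of clipped 1-gram to 4-gram precisions, multiplied by a brevity penalty. It assumes whitespace tokenization and no smoothing; the paper's exact tokenization, smoothing, and corpus-level aggregation are not stated in this summary, so the function below (`bleu4`, a hypothetical name) will not necessarily reproduce the reported scores.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate: str, reference: str) -> float:
    """Unsmoothed sentence-level BLEU-4 in [0, 1] for one reference."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / 4)
```

Multiplying the result by 100 gives the percentage form used in the findings above.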
Abstract
Software documentation plays a major role in the usage and development of a project. Widespread adoption of open source software projects contributes to larger and faster development of the projects, making it difficult to maintain the associated documentation. Existing automated approaches to generate documentation largely focus on source code. However, information useful for documentation is observed to be scattered across various artifacts that co-evolve with the source code. Leveraging this information across multiple artifacts can reduce the effort involved in maintaining documentation. Hence, we propose DocFetch to generate different types of documentation from multiple software artifacts. We employ a multi-layer, prompt-based LLM and generate structured documentation corresponding to different documentation types for the data consolidated in the DocMine dataset. We evaluate the…
