LARCH: Large Language Model-based Automatic Readme Creation with Heuristics

Yuta Koreeda; Terufumi Morishita; Osamu Imaichi; Yasuhiro Sogawa

arXiv:2308.03099 · cs.CL · August 23, 2023

TL;DR

LARCH is a system that automatically generates accurate and coherent readme files for software repositories by identifying representative code fragments using heuristics and weak supervision, improving over baseline methods.

Contribution

The paper introduces LARCH, a novel approach that leverages heuristics and weak supervision to identify representative code for automatic readme generation using large language models.

Findings

01

LARCH outperforms baseline methods in generating coherent readmes.

02

Human and automated evaluations confirm LARCH's factual correctness.

03

Open-source implementation with VS Code and CLI interfaces.

Abstract

Writing a readme is a crucial aspect of software development as it plays a vital role in managing and reusing program code. Though it is a pain point for many developers, automatically creating one remains a challenge even with the recent advancements in large language models (LLMs), because it requires generating an abstract description from thousands of lines of code. In this demo paper, we show that LLMs are capable of generating coherent and factually correct readmes if we can identify a code fragment that is representative of the repository. Building upon this finding, we developed LARCH (LLM-based Automatic Readme Creation with Heuristics), which leverages representative code identification with heuristics and weak supervision. Through human and automated evaluations, we illustrate that LARCH can generate coherent and factually correct readmes in the majority of cases,…

Code & Models

Repositories

hitachi-nlp/larch
PyTorch · Official
