Automatic Pull Request Title Generation
Ting Zhang, Ivana Clairine Irsan, Ferdian Thung, DongGyun Han, David Lo, Lingxiao Jiang

TL;DR
This paper introduces the task of automatically generating pull request titles using summarization techniques, constructs a large dataset, and demonstrates that BART outperforms other models in generating high-quality PR titles.
Contribution
The paper formulates PR title generation as a summarization task, builds a dataset of 43,816 PRs from 495 repositories, and evaluates state-of-the-art summarization models, with BART performing best.
Findings
BART achieves ROUGE-1, ROUGE-2, and ROUGE-L F1-scores of 47.22, 25.27, and 43.12, respectively.
In manual evaluation, the titles generated by BART are preferred.
The dataset contains 43,816 PRs from 495 repositories.
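The ROUGE figures above are n-gram and longest-common-subsequence overlap scores between a generated title and the original title. The following is a minimal sketch of how such F1-scores can be computed, assuming lowercase whitespace tokenization and no stemming. It is not the evaluation script used in the paper, and the two example titles are invented for illustration. Scores here are on a 0 to 1 scale, whereas the paper reports them multiplied by 100.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/cand) and recall (overlap/ref)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 using clipped n-gram overlap (simplified tokenization)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    c = candidate.lower().split()
    r = reference.lower().split()
    return f1(lcs_length(c, r), len(c), len(r))


# Hypothetical titles, not taken from the paper's dataset.
generated = "fix null pointer in parser"
original = "fix null pointer exception in config parser"

scores = {
    "ROUGE-1": rouge_n(generated, original, 1),
    "ROUGE-2": rouge_n(generated, original, 2),
    "ROUGE-L": rouge_l(generated, original),
}
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Standard ROUGE packages typically add stemming and more careful tokenization, so their scores can differ slightly from this simplified version.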
Abstract
Pull Requests (PRs) are a mechanism on modern collaborative coding platforms, such as GitHub. PRs allow developers to tell others that their code changes are available for merging into another branch in a repository. A PR needs to be reviewed and approved by the core team of the repository before the changes are merged into the branch. Usually, reviewers need to identify a PR that is in line with their interests before providing a review. By default, PRs are arranged in a list view that shows the titles of PRs. Therefore, it is desirable to have a precise and concise title, which is beneficial for both reviewers and other developers. However, it is often the case that developers do not provide good titles; we find that many existing PR titles are either inappropriate in length (i.e., too short or too long) or fail to convey useful information, which may result in PR being ignored or…
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Wikis in Education and Collaboration
