Automatic Pull Request Description Generation Using LLMs: A T5 Model Approach
Md Nazmus Sakib, Md Athikul Islam, Md Mashrur Arifin

TL;DR
This paper fine-tunes a T5 model to generate pull request descriptions from commit messages and source code comments, and reports that it outperforms a LexRank baseline on ROUGE metrics.
Contribution
It introduces a novel application of the T5 model to PR description generation, framing the task as text summarization and fine-tuning on a dataset of 33,466 PRs.
Findings
The fine-tuned T5 model significantly outperforms the LexRank baseline on ROUGE metrics
Fine-tuning on 33,466 PRs improves description quality
Automated PR descriptions aid developers and reviewers
Abstract
Developers create pull request (PR) descriptions to provide an overview of their changes and explain the motivations behind them. These descriptions help reviewers and fellow developers quickly understand the updates. Despite their importance, some developers omit these descriptions. To tackle this problem, we propose an automated method for generating PR descriptions based on commit messages and source code comments. This method frames the task as a text summarization problem, for which we utilized the T5 text-to-text transfer model. We fine-tuned a pre-trained T5 model using a dataset containing 33,466 PRs. The model's effectiveness was assessed using ROUGE metrics, which are recognized for their strong alignment with human evaluations. Our findings reveal that the T5 model significantly outperforms LexRank, which served as our baseline for comparison.
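To make the evaluation step concrete, here is a minimal sketch of how a ROUGE-N score compares a generated PR description against a reference one. It is an illustration only: the abstract does not say which ROUGE implementation or variants the authors used, and this version assumes lowercase whitespace tokenization with no stemming. The function names and the example strings are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: precision, recall, and F1 over clipped n-gram overlap.

    Assumption: lowercase whitespace tokenization, no stemming or stopword removal.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical reference and generated descriptions, for illustration only.
reference = "fix null pointer in user login handler"
generated = "fix null pointer in login"

rouge1 = rouge_n(generated, reference, n=1)
rouge2 = rouge_n(generated, reference, n=2)
print(rouge1)
print(rouge2)
```

In this toy case every generated word appears in the reference, so ROUGE-1 precision is high while recall is lower because the reference contains words the generated text omits. A published evaluation would normally rely on an established ROUGE package rather than a hand-rolled function like this one.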
Taxonomy
Topics: Service-Oriented Architecture and Web Services · Advanced Computational Techniques and Applications · Software Engineering Research
Methods: Gated Linear Unit · Attention Is All You Need · Byte Pair Encoding · Softmax · Dense Connections · Inverse Square Root Schedule · Dropout · Linear Layer · Attention Dropout
