A Large-Scale Study on Source Code Reviewer Recommendation

Jakub Lipcak; Bruno Rossi

arXiv:1806.07619·cs.SE·October 29, 2018

TL;DR

This large-scale study compares two source code reviewer recommendation algorithms, RevFinder and a Naive Bayes-based approach, across 51 GitHub and Gerrit projects. It finds that no single model is best for every project and that the hosting repository (Gerrit or GitHub) influences recommendation results.

Contribution

The paper compares RevFinder and a Naive Bayes-based approach on a dataset of 51 projects and more than 293K pull requests, highlighting how repository differences and sub-project information affect recommendation performance.

Findings

01. No model is best for all projects.

02. Repository type affects recommendation results.

03. Using sub-project info improves recommendations.

Abstract

Context: Software code reviews are an important part of the development process, leading to better software quality and reduced overall costs. However, finding appropriate code reviewers is a complex and time-consuming task. Goals: In this paper, we present a large-scale study comparing the performance of two main source code reviewer recommendation algorithms (RevFinder and a Naive Bayes-based approach) in identifying the best code reviewers for opened pull requests. Method: We mined data from GitHub and Gerrit repositories, building a large dataset of 51 projects, with more than 293K pull requests analyzed, 180K owners and 157K reviewers. Results: Based on this large-scale analysis, we can state that i) no model can be generalized as best for all projects, ii) the usage of a different repository (Gerrit, GitHub) can have an impact on the recommendation results, iii) exploiting sub-projects…
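To make the idea of reviewer recommendation concrete, here is a minimal sketch of a path-similarity recommender, loosely in the spirit of RevFinder, which is generally described as ranking candidate reviewers by how similar the file paths of a new change are to paths they reviewed before. This is not the implementation evaluated in the study: the function names `common_prefix_score` and `recommend`, the single common-prefix similarity measure, and the toy review history are all assumptions made for illustration.

```python
from collections import defaultdict


def common_prefix_score(path_a, path_b):
    """Fraction of leading path components that two file paths share."""
    a, b = path_a.split("/"), path_b.split("/")
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))


def recommend(new_files, history, k=3):
    """Rank reviewers by summed path similarity to past pull requests.

    history is a list of (files, reviewers) tuples, one per past pull
    request (a hypothetical in-memory format, not the study's dataset).
    """
    scores = defaultdict(float)
    for past_files, reviewers in history:
        # Average similarity over all file pairs of the two pull requests.
        pairs = [(n, p) for n in new_files for p in past_files]
        if not pairs:
            continue
        sim = sum(common_prefix_score(n, p) for n, p in pairs) / len(pairs)
        for reviewer in reviewers:
            scores[reviewer] += sim
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, score in ranked[:k] if score > 0]


# Toy history: reviewer names and paths are invented for the example.
history = [
    (["src/ui/button.py"], ["alice"]),
    (["src/db/query.py"], ["bob"]),
    (["docs/readme.md"], ["carol"]),
]
top = recommend(["src/ui/menu.py"], history, k=2)
```

In this toy run, `alice` ranks above `bob` because her past review shares two leading path components with the new file rather than one. A Naive Bayes-based recommender would instead learn, from the same kind of history, how likely each reviewer is given features of the pull request.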

Code & Models

Repositories

XLipcak/rev-rec (Official)
