An Industrial Case Study on Shrinking Code Review Changesets through Remark Prediction
Tobias Baum, Steffen Herbold, Kurt Schneider

TL;DR
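This study explores a predictive approach to identifying change parts that can be left out of a code review without harm. In an industrial case study, data-driven prediction models and a novel algorithm for tracing review remarks to their triggers made it possible to skip about 25% of the change parts while missing only about 1% of the review remarks.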
Contribution
It introduces a data-mining method for predicting the importance of change parts for review, proposes a novel algorithm for tracing review remarks to their triggers, and reports practical insights from an industrial case study.
Findings
Review could be skipped for 25% of the change parts and 23% of the changed Java source code lines
Only about 1% of review remarks were missed
Highlights limitations of syntactic rules and data noise
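The two headline figures are a trade-off between the share of change parts skipped and the share of review remarks missed. The sketch below shows one way such shares could be computed; it is not the authors' implementation. The `ChangePart` record, its field names, and the `skip_tradeoff` function are hypothetical, and it assumes each change part already carries a model prediction and a count of the remarks traced to it.

```python
from dataclasses import dataclass


@dataclass
class ChangePart:
    # Hypothetical record; field names are assumptions for illustration.
    part_id: str
    predicted_skippable: bool  # output of some prediction model
    remarks: int               # review remarks traced to this part


def skip_tradeoff(parts):
    """Return (share of parts skipped, share of remarks missed)."""
    total_parts = len(parts)
    total_remarks = sum(p.remarks for p in parts)
    skipped = [p for p in parts if p.predicted_skippable]

    # Share of change parts the reviewer would not have to look at.
    skipped_share = len(skipped) / total_parts if total_parts else 0.0

    # Remarks attached to skipped parts would have been missed.
    missed = sum(p.remarks for p in skipped)
    missed_share = missed / total_remarks if total_remarks else 0.0
    return skipped_share, missed_share


if __name__ == "__main__":
    demo = [
        ChangePart("a", True, 0),
        ChangePart("b", False, 2),
        ChangePart("c", False, 1),
        ChangePart("d", False, 1),
    ]
    print(skip_tradeoff(demo))
```

A model is useful in this setting when the first share is large and the second stays close to zero.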
Abstract
Change-based code review is used widely in industrial software development. Thus, research on tools that help the reviewer to achieve better review performance can have a high impact. We analyze one possibility to provide cognitive support for the reviewer: Determining the importance of change parts for review, specifically determining which parts of the code change can be left out from the review without harm. To determine the importance of change parts, we extract data from software repositories and build prediction models for review remarks based on this data. The approach is discussed in detail. To gather the input data, we propose a novel algorithm to trace review remarks to their triggers. We apply our approach in a medium-sized software company. In this company, we can avoid the review of 25% of the change parts and of 23% of the changed Java source code lines, while missing only…
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Software Testing and Debugging Techniques
