# Towards Automating Precision Studies of Clone Detectors

**Authors:** Vaibhav Saini, Farima Farmahinifarahani, Yadong Lu, Di Yang, Pedro Martins, Hitesh Sajnani, Pierre Baldi, Cristina Lopes

arXiv: 1812.05195 · 2019-05-30

## TL;DR

This paper introduces a semi-automated method that combines automatic clone classification with manual validation, reducing the human effort needed in precision studies of clone detectors, and aggregates the labels from multiple teams into an evolving dataset of clone pairs.

## Contribution

It presents a novel semi-automated approach to assessing the precision of clone detectors and aggregates the labeling effort of multiple teams into a single evolving dataset of labeled clone pairs.

## Key findings

- The automatic classification approach has very high precision
- It significantly reduces the number of clone pairs that need human validation
- Labels from multiple teams are aggregated into a single evolving clone dataset

## Abstract

Current research in clone detection suffers from poor ecosystems for evaluating precision of clone detection tools. Corpora of labeled clones are scarce and incomplete, making evaluation labor-intensive and idiosyncratic, and limiting inter-tool comparison. Precision-assessment tools are simply lacking. We present a semi-automated approach to facilitate precision studies of clone detection tools. The approach merges automatic mechanisms of clone classification with manual validation of clone pairs. We demonstrate that the proposed automatic approach has a very high precision and it significantly reduces the number of clone pairs that need human validation during precision experiments. Moreover, we aggregate the individual effort of multiple teams into a single evolving dataset of labeled clone pairs, creating an important asset for software clone research.
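
The workflow the abstract describes, automatic classification backed by manual validation, can be illustrated with a minimal sketch. This summary does not describe the paper's actual classifier, so everything here is an assumption made for illustration: the token-level Jaccard similarity, the `high` and `low` thresholds, and the names `ClonePair`, `auto_classify`, and `precision_study` are all hypothetical. The sketch shows only the general idea: pairs the automatic step can label confidently skip human review, and precision is the fraction of reported pairs judged to be true clones.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple


@dataclass(frozen=True)
class ClonePair:
    """A candidate clone pair reported by a clone detector (hypothetical type)."""
    left: str
    right: str


def token_jaccard(a: str, b: str) -> float:
    """Jaccard similarity over whitespace-separated tokens (a stand-in metric)."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def auto_classify(pair: ClonePair, high: float = 0.9, low: float = 0.1) -> Optional[bool]:
    """Return True/False when confident, or None to defer to a human.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    score = token_jaccard(pair.left, pair.right)
    if score >= high:
        return True
    if score <= low:
        return False
    return None


def precision_study(
    pairs: Iterable[ClonePair],
    ask_human: Callable[[ClonePair], bool],
) -> Tuple[float, int]:
    """Estimate precision and count how many pairs needed manual validation."""
    true_positives = 0
    manual = 0
    total = 0
    for pair in pairs:
        total += 1
        label = auto_classify(pair)
        if label is None:
            # The automatic step is unsure, so fall back to a human judge.
            manual += 1
            label = ask_human(pair)
        true_positives += int(label)
    precision = true_positives / total if total else 0.0
    return precision, manual
```

A caller would pass the sampled pairs from a detector's output together with a callback that collects a human judgment; the second return value indicates how much manual effort the automatic step left over.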

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.05195/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1812.05195/full.md

## References

7 references — full list in the complete paper: https://tomesphere.com/paper/1812.05195/full.md

---
Source: https://tomesphere.com/paper/1812.05195