Senatus -- A Fast and Accurate Code-to-Code Recommendation Engine

Fran Silavong; Sean Moran; Antonios Georgiadis; Rohan Saphal; Robert Otter

arXiv:2111.04473 · cs.SE · April 27, 2022

TL;DR

Senatus is a code-to-code recommendation engine that improves both retrieval speed and recommendation quality by using a new locality sensitive hashing (LSH) algorithm that corrects for skew in code snippet length.

Contribution

It introduces De-Skew LSH, a new locality sensitive hashing method that enables fast, scalable, and more accurate code snippet recommendations by accounting for snippet length distribution.
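This summary does not describe the internals of De-Skew LSH, so the sketch below illustrates only standard MinHash LSH, the baseline that the findings compare against. It shows why a hash-based index avoids scanning the whole repository: a query inspects only the buckets its signature falls into. The names `minhash_signature` and `MinHashLSHIndex`, the whitespace tokenizer, and the `num_perm` and `bands` values are assumptions made for illustration and are not taken from the paper. The length-aware de-skewing step is deliberately omitted.

```python
import hashlib
from collections import defaultdict


def tokenize(snippet):
    """Hypothetical featurizer: the set of whitespace-separated tokens."""
    return set(snippet.split())


def minhash_signature(features, num_perm=32):
    """Return a MinHash signature: the minimum hash value per seeded hash."""
    if not features:
        raise ValueError("features must be non-empty")
    signature = []
    for seed in range(num_perm):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{feat}".encode(), digest_size=8).digest(),
                "big",
            )
            for feat in features
        ))
    return signature


class MinHashLSHIndex:
    """Banded MinHash LSH: items sharing any full band become candidates."""

    def __init__(self, num_perm=32, bands=16):
        if num_perm % bands != 0:
            raise ValueError("num_perm must be divisible by bands")
        self.num_perm = num_perm
        self.bands = bands
        self.rows = num_perm // bands
        # One hash table per band, mapping a band of the signature to item keys.
        self.buckets = [defaultdict(set) for _ in range(bands)]

    def _band_keys(self, signature):
        for b in range(self.bands):
            yield b, tuple(signature[b * self.rows:(b + 1) * self.rows])

    def add(self, key, features):
        signature = minhash_signature(features, self.num_perm)
        for b, band_key in self._band_keys(signature):
            self.buckets[b][band_key].add(key)

    def query(self, features):
        """Return candidate keys; cost depends on bucket sizes, not index size."""
        signature = minhash_signature(features, self.num_perm)
        candidates = set()
        for b, band_key in self._band_keys(signature):
            candidates |= self.buckets[b].get(band_key, set())
        return candidates


index = MinHashLSHIndex()
index.add("read_file", tokenize("with open ( path ) as f : return f . read ( )"))
index.add("sum_list", tokenize("total = 0 for x in xs : total += x"))
candidates = index.query(tokenize("with open ( path ) as f : return f . read ( )"))
```

Candidates returned by `query` would normally be re-ranked by an exact similarity measure. Because MinHash approximates Jaccard similarity over feature sets, very short and very long snippets can behave differently under it, which is the kind of length skew the contribution statement says De-Skew LSH accounts for.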

Findings

01

Senatus improves F1 score by 31.21% over the Facebook Aroma baseline.

02

Senatus achieves 147.9x faster query times than Facebook Aroma.

03

Senatus outperforms MinHash LSH by 29.2% in F1 score.

Abstract

Machine learning on source code (MLOnCode) is a popular research field that has been driven by the availability of large-scale code repositories and the development of powerful probabilistic and deep learning models for mining source code. Code-to-code recommendation is a task in MLOnCode that aims to recommend relevant, diverse and concise code snippets that usefully extend the code currently being written by a developer in their development environment (IDE). Code-to-code recommendation engines hold the promise of increasing developer productivity by reducing context switching from the IDE and increasing code-reuse. Existing code-to-code recommendation engines do not scale gracefully to large codebases, exhibiting a linear growth in query time as the code repository increases in size. In addition, existing code-to-code recommendation engines fail to account for the global statistics…
