Combining Program Analysis and Statistical Language Model for Code Statement Completion

Son Nguyen; Tien N. Nguyen; Yi Li; Shaohua Wang

arXiv:1911.07781 · cs.SE · November 19, 2019 · 1 cites

TL;DR

AutoSC combines program analysis with statistical language modeling to complete partially written code statements, reaching 38.9-41.3% top-1 accuracy, which is 9X-69X higher than a state-of-the-art approach.

Contribution

The paper introduces AutoSC, which integrates program analysis with a statistical language model so that completed code statements are both valid and frequent.

Findings

01. AutoSC achieves 38.9-41.3% top-1 accuracy.

02. AutoSC outperforms a state-of-the-art approach by 9X-69X in top-1 accuracy.

03. AutoSC combines code validity (from program analysis) with naturalness (from the language model) to produce better predictions.

Abstract

Automatic code completion helps improve developers' productivity in their programming tasks. A program contains instructions expressed via code statements, which are considered the basic units of program execution. In this paper, we introduce AutoSC, which combines program analysis and the principle of software naturalness to fill in a partially completed statement. AutoSC benefits from the strengths of both directions: the completed code statement is both frequent and valid. AutoSC is first trained on a large code corpus to derive the templates of candidate statements. Then, it uses program analysis to validate and concretize the templates into syntactically valid and type-valid candidate statements. Finally, these candidates are ranked using a language model trained on the lexical form of the source code in the code corpus. Our empirical evaluation on the large datasets of…
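
The three stages described in the abstract (derive templates, concretize them against the program context, rank with a language model) can be illustrated with a toy sketch. This is not AutoSC's implementation: the typed-hole template notation (`<int>`), the `scope` dictionary standing in for program analysis, the add-one smoothed bigram model, and every function name below are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import product


def train_bigram(corpus):
    """Count unigrams and bigrams over tokenized statements, with start/end markers."""
    unigrams, bigrams = Counter(), Counter()
    for stmt in corpus:
        tokens = ["<s>"] + stmt + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(stmt, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a tokenized statement."""
    vocab = len(unigrams)
    tokens = ["<s>"] + stmt + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(tokens, tokens[1:])
    )


def concretize(template, scope):
    """Fill each typed hole (e.g. "<int>") with every in-scope variable of that type.

    A hole whose type has no in-scope variable makes the template yield nothing;
    this is a toy stand-in for the validity check done by program analysis.
    """
    choices = []
    for tok in template:
        if len(tok) > 2 and tok[0] == "<" and tok[-1] == ">":
            choices.append(scope.get(tok[1:-1], []))
        else:
            choices.append([tok])
    return [list(combo) for combo in product(*choices)]


def complete(prefix, templates, scope, unigrams, bigrams, k=3):
    """Return the top-k concretized statements that extend the typed prefix."""
    candidates = []
    for template in templates:
        for stmt in concretize(template, scope):
            if stmt[:len(prefix)] == prefix and stmt not in candidates:
                candidates.append(stmt)
    candidates.sort(key=lambda s: log_prob(s, unigrams, bigrams), reverse=True)
    return candidates[:k]


# Hypothetical training corpus of tokenized statements.
corpus = [
    ["i", "=", "i", "+", "1", ";"],
    ["i", "=", "i", "+", "1", ";"],
    ["count", "=", "count", "+", "1", ";"],
    ["i", "=", "0", ";"],
]
# Hypothetical templates; the second needs a List variable that is not in scope.
templates = [
    ["<int>", "=", "<int>", "+", "1", ";"],
    ["<int>", "=", "<List>", ".", "size", "(", ")", ";"],
]
scope = {"int": ["i", "count"]}  # variables visible at the completion point

unigrams, bigrams = train_bigram(corpus)
ranked = complete(["i", "="], templates, scope, unigrams, bigrams)
for stmt in ranked:
    print(" ".join(stmt))
```

In this sketch the second template is dropped because no `List` variable is in scope, and the remaining candidates are ordered purely by how often their token pairs occur in the toy corpus. A real system would need far richer type checking and a much larger corpus.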

Taxonomy

Topics: Software Engineering Research · Software Reliability and Analysis Research · Software System Performance and Reliability