On the Generation, Structure, and Semantics of Grammar Patterns in Source Code Identifiers

Christian D. Newman; Reem S. AlSuhaibani; Michael J. Decker; Anthony Peruma; Dishant Kaushik; Mohamed Wiem Mkaouer; Emily Hill

arXiv:2007.08033·cs.SE·July 17, 2020


TL;DR

This paper analyzes naming patterns in source code identifiers using part-of-speech sequences, examining their structure, semantics, and how well current models can automatically identify these patterns to aid code comprehension.

Contribution

It establishes common naming patterns across identifier types, analyzes their impact on understanding code, and evaluates the accuracy of state-of-the-art POS tagging techniques for modeling identifiers.

Findings

1. Identified common naming patterns in class and attribute identifiers
2. Analyzed how patterns influence code comprehension
3. Evaluated POS tagging accuracy and its limitations

Abstract

Identifiers make up a majority of the text in code. They are one of the most basic mediums through which developers describe the code they create and understand the code that others create. Therefore, understanding the patterns latent in identifier naming practices and how accurately we are able to automatically model these patterns is vital if researchers are to support developers and automated analysis approaches in comprehending and creating identifiers correctly and optimally. This paper investigates identifiers by studying sequences of part-of-speech annotations, referred to as grammar patterns. This work advances our understanding of these patterns and our ability to model them by 1) establishing common naming patterns in different types of identifiers, such as class and attribute names; 2) analyzing how different patterns influence comprehension; and 3) studying the accuracy of…
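To make the term "grammar pattern" concrete, the following is a minimal sketch of how an identifier could be split into words and mapped to a sequence of part-of-speech tags. It is not the paper's tagger: the lexicon `TOY_LEXICON`, the tag names (`V`, `N`, `NM` for noun modifier, `P`, `UNK`), and the rule that a noun followed by another noun becomes a modifier are all assumptions chosen for illustration.

```python
import re

# Hypothetical toy lexicon (word -> coarse tag); not taken from the paper.
TOY_LEXICON = {
    "get": "V", "set": "V",
    "user": "N", "name": "N", "buffer": "N", "size": "N", "string": "N",
    "to": "P",
}


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def grammar_pattern(identifier):
    """Return the identifier's tag sequence as a space-separated string."""
    words = split_identifier(identifier)
    tags = [TOY_LEXICON.get(w, "UNK") for w in words]
    # Assumed rule: a noun directly followed by another noun acts as a
    # noun modifier (NM); the last noun in the run stays the head noun (N).
    for i in range(len(tags) - 1):
        if tags[i] == "N" and tags[i + 1] == "N":
            tags[i] = "NM"
    return " ".join(tags)


if __name__ == "__main__":
    for name in ["getUserName", "bufferSize", "to_string"]:
        print(name, "->", grammar_pattern(name))
```

Under these assumptions, `getUserName` yields the pattern `V NM N` and `bufferSize` yields `NM N`. A real tagger has to resolve ambiguous words from context, which is the modeling accuracy question the abstract raises.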
