Investigating the Impact of Vocabulary Difficulty and Code Naturalness on Program Comprehension

Bin Lin; Gregorio Robles

arXiv:2308.13429 · cs.SE · August 28, 2023

Open Access

TL;DR

This study investigates whether vocabulary difficulty and code naturalness are correlated with the readability and understandability of source code, with the aim of improving readability and understandability prediction.

Contribution

The paper assesses code readability and understandability from the perspective of language acquisition, examining vocabulary difficulty and code naturalness, and explores whether these two metrics can enhance prediction models.

Findings

01. Code naturalness correlates with readability scores.

02. Vocabulary difficulty impacts understandability assessments.

03. Naturalness and vocabulary metrics can improve prediction accuracy.

Abstract

Context: Developers spend most of their time comprehending source code during software development. Automatically assessing how readable and understandable source code is can provide various benefits in different tasks, such as task triaging and code reviews. While several studies have proposed approaches to predict software readability and understandability, most of them only focus on local characteristics of source code. Besides, the performance of understandability prediction is far from satisfactory.

Objective: In this study, we aim to assess readability and understandability from the perspective of language acquisition. More specifically, we would like to investigate whether code readability and understandability are correlated with the naturalness and vocabulary difficulty of source code.

Method: To assess code naturalness, we adopted the cross-entropy metric, while we use a…
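
The Method part of the abstract names cross-entropy as the naturalness measure but is cut off before giving details. As a rough illustration only, the sketch below computes per-token cross-entropy under a bigram language model with add-one smoothing. The bigram model, the smoothing scheme, the pre-tokenized input, and the names `train_bigram` and `cross_entropy` are all assumptions made for this example and are not taken from the paper. Lower values mean the token sequence is more predictable given the training corpus, which is the usual reading of "more natural".

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Collect bigram statistics from a list of token sequences.

    Hypothetical helper: returns (context counts, bigram counts, vocab size).
    """
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for tokens in corpus:
        padded = ["<s>"] + list(tokens)        # "<s>" marks the sequence start
        contexts.update(padded[:-1])           # tokens that have a successor
        bigrams.update(zip(padded, padded[1:]))
        vocab.update(tokens)
    return contexts, bigrams, len(vocab) + 1   # +1 reserves mass for unseen tokens


def cross_entropy(tokens, model):
    """Average negative log2 probability per token (bits per token)."""
    contexts, bigrams, vocab_size = model
    tokens = list(tokens)
    if not tokens:
        raise ValueError("cannot score an empty token sequence")
    padded = ["<s>"] + tokens
    total_bits = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # Add-one (Laplace) smoothing so unseen bigrams get non-zero probability.
        prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total_bits -= math.log2(prob)
    return total_bits / len(tokens)


# Toy corpus for illustration only; not data from the study.
corpus = [
    ["for", "i", "in", "range", "(", "n", ")", ":"],
    ["for", "j", "in", "range", "(", "m", ")", ":"],
]
model = train_bigram(corpus)
familiar = ["for", "i", "in", "range", "(", "n", ")", ":"]
scrambled = list(reversed(familiar))
print(cross_entropy(familiar, model), cross_entropy(scrambled, model))
```

In this toy setup the scrambled sequence scores higher than the familiar one, because none of its bigrams occur in the corpus. A real study would train on a large code corpus and would likely use a stronger language model and tokenizer than this sketch.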

Taxonomy

Topics: Software Engineering Research · Software Reliability and Analysis Research · Software Engineering Techniques and Practices