How Readable is Model-generated Code? Examining Readability and Visual Inspection of GitHub Copilot
Naser Al Madi

TL;DR
This study evaluates the readability and visual inspection of code generated by GitHub Copilot. It finds the generated code comparable to human-written code in readability and complexity, but programmers give it less visual attention, which raises concerns about automation bias.
Contribution
The paper provides empirical evidence on the readability and visual attention differences between model-generated and human-written code, highlighting potential risks of automation bias.
Findings
Model-generated code has similar readability and complexity to human-written code.
Programmers pay less visual attention to model-generated code.
Automation bias may lead to complacency in code review.
Abstract
Background: Recent advancements in large language models have motivated the practical use of such models in code generation and program synthesis. However, little is known about the effects of such tools on code readability and visual attention in practice. Objective: In this paper, we focus on GitHub Copilot to address the issues of readability and visual inspection of model generated code. Readability and low complexity are vital aspects of good source code, and visual inspection of generated code is important in light of automation bias. Method: Through a human experiment (n=21) we compare model generated code to code written completely by human programmers. We use a combination of static code analysis and human annotators to assess code readability, and we use eye tracking to assess the visual inspection of code. Results: Our results suggest that model generated code is…
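The abstract mentions static code analysis as one way of assessing complexity, but the excerpt above does not name the specific metrics used. As an illustration only, the sketch below approximates cyclomatic complexity for a Python snippet by counting decision points with the standard `ast` module. The function name `approx_cyclomatic_complexity`, the choice of counted node types, and the sample `snippet` are assumptions made for this example, not the paper's method.

```python
import ast

# Node types treated as decision points in this simplified sketch (an assumption;
# production tools count additional constructs such as comprehensions).
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in `source`."""
    tree = ast.parse(source)
    complexity = 1  # a straight-line program has a single path
    for node in ast.walk(tree):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` contributes two additional short-circuit branches
            complexity += len(node.values) - 1
    return complexity


# Hypothetical sample input, used only to exercise the function.
snippet = """
def classify(n):
    if n < 0 and n % 2 == 0:
        return "negative even"
    for i in range(n):
        if i == 3:
            return "has three"
    return "other"
"""

print(approx_cyclomatic_complexity(snippet))
```

A score like this could be computed for both a generated and a human-written solution to the same task and then compared, though a single metric of this kind captures only one narrow aspect of readability.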
