Exploring the Impact of Code Style in Identifying Good Programmers
Rafed Muhammad Yasir, Ahmedul Kabir

TL;DR
This paper investigates whether code style can serve as an indicator of programmer quality by analyzing data from Google Code Jam with machine learning models, finding that stylistic features can predict good programmers.
Contribution
It is the first study to explore the use of code style as a predictor of programmer quality using machine learning techniques.
Findings
Good programmers can be identified using stylistic features.
No specific style groups are associated with good programmers.
Supervised models trained on stylistic features can predict good programmers, as measured by recall, macro-F1, AUC-ROC, and balanced accuracy.
Abstract
Code style is an aesthetic choice exhibited in source code that reflects programmers' individual coding habits. This study is the first to investigate whether code style can be used as an indicator to identify good programmers. Data from Google Code Jam was chosen for conducting the study. A cluster analysis was performed to find whether a particular coding style could be associated with good programmers. Furthermore, supervised machine learning models were trained using stylistic features and evaluated using recall, macro-F1, AUC-ROC, and balanced accuracy to predict good programmers. The results demonstrate that good programmers may be identified using supervised machine learning models, even though no particular style group could be singled out as a good style.
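The four evaluation metrics named in the abstract can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration: the toy labels and scores are hypothetical and are not data or results from the paper, the encoding of "good programmer" as class 1 is arbitrary, and the helper names (`recall`, `macro_f1`, `balanced_accuracy`, `auc_roc`) are not taken from the authors' code.

```python
def recall(y_true, y_pred, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    preds = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in preds) / len(preds) if preds else 0.0


def precision(y_true, y_pred, label):
    """Fraction of samples predicted as `label` whose true class is `label`."""
    truths = [t for t, p in zip(y_true, y_pred) if p == label]
    return sum(t == label for t in truths) / len(truths) if truths else 0.0


def f1(y_true, y_pred, label):
    """Harmonic mean of precision and recall for one class."""
    p = precision(y_true, y_pred, label)
    r = recall(y_true, y_pred, label)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true))
    return sum(f1(y_true, y_pred, label) for label in labels) / len(labels)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    labels = sorted(set(y_true))
    return sum(recall(y_true, y_pred, label) for label in labels) / len(labels)


def auc_roc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = "good programmer", 0 = otherwise (assumed encoding).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1, 0.35, 0.05]  # made-up classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]       # threshold at 0.5

print(f"recall (class 1):  {recall(y_true, y_pred, 1):.3f}")
print(f"macro-F1:          {macro_f1(y_true, y_pred):.3f}")
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.3f}")
print(f"AUC-ROC:           {auc_roc(y_true, scores):.3f}")
```

On this toy input, plain accuracy would be 6/8 = 0.75 while balanced accuracy is about 0.667, which shows how class-averaged metrics penalize a model that misses members of a minority class. Whether class imbalance motivated the authors' choice of metrics is not stated in the abstract.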
Taxonomy
Topics: Software Engineering Research
