Predicting the Understandability of Computational Notebooks through Code Metrics Analysis
Mojtaba Mostafavi Ghahfarokhi, Alireza Asadi, Arash Asgari, Bardia Mohammadi, Abbas Heydarnoori, Masih Beigi Rizi

TL;DR
This paper assesses the understandability of Jupyter notebooks by analyzing user comments and code metrics, with a Random Forest model reaching 89% accuracy.
Contribution
It proposes a new user opinion-based metric, UOCU, and demonstrates its effectiveness in predicting notebook understandability using machine learning.
Findings
UOCU outperforms prior assessment methods
Random Forest achieved 89% accuracy
Combining UOCU with upvotes improves predictions
Abstract
Computational notebooks are the primary coding tools for data scientists, but their code quality remains understudied and often poor. Given the importance of maintainability and reusability, enhancing code understandability is essential. Traditional methods for assessing understandability typically rely on limited questionnaires or metadata like likes and votes, which may not reflect actual code clarity. To address this, we propose a novel approach that leverages user opinions from software repositories to assess the understandability of Jupyter notebooks. We conducted a case study using 542,051 Kaggle Jupyter notebooks compiled in the DistilKaggle dataset. To identify user comments related to code understandability, we used a fine-tuned DistilBERT transformer. We then introduced a new metric, i.e., User Opinion Code Understandability (UOCU), based on the number of relevant comments,…
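The comment-filtering step described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical keyword matcher (`is_understandability_comment`) in place of the fine-tuned DistilBERT classifier the authors use, and it only counts relevant comments per notebook. It does not compute UOCU, whose full definition is not given in the text above. The cue phrases, function names, and sample data are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical cue phrases. The paper uses a fine-tuned DistilBERT
# classifier for this step; a keyword match is only a stand-in.
CUES = ("easy to follow", "well explained", "clear",
        "hard to understand", "confusing")


def is_understandability_comment(text: str) -> bool:
    """Return True if the comment mentions any assumed cue phrase."""
    lowered = text.lower()
    return any(cue in lowered for cue in CUES)


def count_relevant_comments(comments):
    """Count understandability-related comments per notebook.

    comments: iterable of (notebook_id, comment_text) pairs.
    """
    counts = Counter()
    for notebook_id, text in comments:
        if is_understandability_comment(text):
            counts[notebook_id] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    ("nb1", "Very clear walkthrough, thanks!"),
    ("nb1", "Upvoted."),
    ("nb2", "The feature engineering part is confusing."),
    ("nb1", "Well explained preprocessing."),
]
relevant_counts = count_relevant_comments(sample)
```

Swapping the keyword matcher for a trained text classifier would leave `count_relevant_comments` unchanged, since it only depends on the boolean decision for each comment.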
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Statistics Education and Methodologies · Educational Assessment and Improvement
Methods: Attention Is All You Need · WordPiece · Residual Connection · Softmax · Layer Normalization · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dropout
