Error Identification Strategies for Python Jupyter Notebooks

Derek Robinson; Neil A. Ernst; Enrique Larios Vargas; Margaret-Anne D. Storey

arXiv:2203.16653·cs.SE·July 20, 2022

TL;DR

This study explores how Python Jupyter notebook users identify and understand errors. It finds that, despite the notebook environment differing from conventional IDEs, users' debugging strategies resemble those used in traditional programming, with implications for tool design and education.

Contribution

The paper provides an observational analysis of error identification strategies in Python Jupyter notebooks, extending prior research on R notebook users to Python.

Findings

01. Debugging strategies in Jupyter notebooks are similar to those used in traditional programming environments.

02. Users draw on domain knowledge, statistical understanding, and programming skills to find errors.

03. These insights can inform improvements to notebook tools and educational practices.

Abstract

Computational notebooks -- such as Jupyter or Colab -- combine text and data analysis code. They have become ubiquitous in the world of data science and exploratory data analysis. Since these notebooks present a different programming paradigm than conventional IDE-driven programming, it is plausible that debugging in computational notebooks might also be different. More specifically, since creating notebooks blends domain knowledge, statistical analysis, and programming, the ways in which notebook users find and fix errors in these different forms might be different. In this paper, we present an exploratory, observational study on how Python Jupyter notebook users find and understand potential errors in notebooks. Through a conceptual replication of study design investigating the error identification strategies of R notebook users, we presented users with Python Jupyter notebooks…
