What makes an Expert? Comparing Problem-solving Practices in Data Science Notebooks

Manuel Valle Torre; Marcus Specht; Catharine Oertel

arXiv:2602.15428·cs.CY·February 18, 2026

Open Access

TL;DR

This study empirically compares how data science experts and novices solve problems in Jupyter notebooks. It finds that expertise shows up in the overall workflow structure and in iterative, efficient cell-level actions, not in different transitions between data science phases.

Contribution

The paper introduces a multi-level sequence analysis of notebook actions that distinguishes expert from novice problem-solving strategies in data science.

Findings

01. Experts use shorter, more iterative workflows.

02. Novices follow longer, linear processes.

03. Workflow structure and action patterns differentiate expertise.

Abstract

The development of data science expertise requires tacit, process-oriented skills that are difficult to teach directly. This study addresses the resulting challenge of empirically understanding how the problem-solving processes of experts and novices differ. We apply a multi-level sequence analysis to 440 Jupyter notebooks from a public dataset, mapping low-level coding actions to higher-level problem-solving practices. Our findings reveal that experts and novices do not follow fundamentally different transitions between data science phases (e.g., Data Import, EDA, Model Training, Visualization). Instead, expertise is distinguished by the overall workflow structure from a problem-solving perspective and by fine-grained, cell-level action patterns. Novices tend to follow long, linear processes, whereas experts employ shorter, more iterative strategies enacted through efficient,…
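To make the idea of "transitions between data science phases" concrete, here is a minimal sketch of a first-order transition table computed over per-notebook phase sequences. It is an illustration only and not the authors' pipeline: the function name `transition_probabilities` and the two toy sequences are hypothetical, and only the phase labels are taken from the abstract.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Return row-normalised first-order transition probabilities.

    `sequences` is an iterable of label lists, one list per notebook.
    The result maps each source label to a dict of {next_label: probability}.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair every label with the label that immediately follows it.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1

    probs = {}
    for current, nxt_counts in counts.items():
        total = sum(nxt_counts.values())
        probs[current] = {nxt: n / total for nxt, n in nxt_counts.items()}
    return probs


# Hypothetical toy sequences (not data from the paper): one linear
# notebook and one that loops back from modelling to exploration.
linear = ["Data Import", "EDA", "Model Training", "Visualization"]
iterative = ["Data Import", "EDA", "Model Training", "EDA", "Model Training"]

probs = transition_probabilities([linear, iterative])
for source, targets in probs.items():
    print(source, targets)
```

A table like this captures only which phase tends to follow which. According to the abstract, such phase-level transitions do not separate experts from novices, which is why the analysis also looks at overall workflow structure and cell-level actions.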

Taxonomy

Topics: Statistics Education and Methodologies · Data Visualization and Analytics · Computational and Text Analysis Methods