Is spreadsheet syntax better than numeric indexing for cell selection?
Philip Heltweg, Dirk Riehle, Georg-Daniel Schwarz

TL;DR
This study compares spreadsheet-style cell selection syntax (such as "A1") with numeric indexing in data engineering. In a controlled experiment with student participants standing in for data practitioners, spreadsheet syntax led to fewer mistakes when reading code, and to fewer mistakes and faster completion when writing code.
Contribution
It provides empirical evidence that spreadsheet syntax enhances correctness and efficiency over numeric indexing in cell selection tasks.
Findings
Participants made fewer mistakes with spreadsheet syntax, both when reading and when writing code.
Participants were faster when writing code with spreadsheet syntax.
Abstract
Selecting a subset of cells is a common task in data engineering, for example, to remove errors or select only specific parts of a table. Multiple approaches to express this selection exist. One option is numeric indexing, commonly found in general programming languages, where a tuple of numbers identifies the cell. Alternatively, the separate dimensions can be referred to using different enumeration schemes like "A1" for the first cell, commonly found in software such as spreadsheet systems. In a large-scale controlled experiment with student participants as a proxy for data practitioners, we compare the two options with respect to speed and correctness of reading and writing code. The results show that, when reading code, participants make fewer mistakes using spreadsheet-style syntax. Additionally, when writing code, they make fewer mistakes and are faster when using spreadsheet…
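To make the two addressing schemes from the abstract concrete, here is a minimal Python sketch that maps a spreadsheet-style reference such as "A1" onto a numeric index tuple. The function name `a1_to_index`, the zero-based (row, column) convention, and the sample table are illustrative assumptions; the syntax actually used in the experiment is not shown in this summary.

```python
import re


def a1_to_index(ref):
    """Convert a spreadsheet-style reference like "C2" into a zero-based
    (row, column) tuple. Hypothetical helper for illustration only."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a valid cell reference: {ref!r}")
    letters, digits = match.groups()

    # Column letters form a bijective base-26 number: A=1, ..., Z=26, AA=27.
    col = 0
    for ch in letters.upper():
        col = col * 26 + (ord(ch) - ord("A") + 1)

    # Spreadsheet rows and columns start at 1; list indices start at 0.
    return int(digits) - 1, col - 1


# Assumed sample data: a small table stored as a list of rows.
table = [
    ["id", "name", "score"],
    [1, "Ada", 91],
    [2, "Linus", 78],
]

# Spreadsheet-style selection: column C, row 2.
row, col = a1_to_index("C2")

# The same cell selected with numeric indexing: row 1, column 2.
assert table[row][col] == table[1][2] == 91
```

Note that the spreadsheet reference names the column first and counts from one, while the numeric tuple in this sketch names the row first and counts from zero. The sketch only shows how the two notations relate; it makes no claim about which of these differences drives the results reported above.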
Taxonomy
Topics: Spreadsheets and End-User Computing · Statistics Education and Methodologies · Software Engineering Research
