Buckaroo: A Direct Manipulation Visual Data Wrangler
Annabelle Warner, Andrew McNutt, Paul Rosen, El Kindi Rezig

TL;DR
Buckaroo is a visualization tool that helps data scientists identify, inspect, and correct data anomalies efficiently through direct visual manipulation, reducing manual effort and errors in data wrangling.
Contribution
The paper introduces an interactive system that automatically detects anomalies, suggests corrective actions, and supports visual data manipulation with undo/redo capabilities.
Findings
Automatically identifies interesting data groups with anomalies.
Provides recommended wrangling actions for correction.
Supports iterative visual data manipulation with undo/redo.
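The three findings above can be illustrated with a small sketch. This is not Buckaroo's implementation: the function names (`missing_by_group`, `drop_missing`), the `History` class, and the sample rows are all hypothetical, and "anomaly" is narrowed here to missing values only. It assumes the data is a list of dictionaries and that undo/redo is done by keeping snapshots of earlier versions.

```python
from collections import defaultdict
from copy import deepcopy


def missing_by_group(rows, group_key, value_key):
    """Count rows whose value_key is missing (None), per group (hypothetical detector)."""
    counts = defaultdict(int)
    for row in rows:
        if row.get(value_key) is None:
            counts[row[group_key]] += 1
    return dict(counts)


def drop_missing(rows, value_key):
    """One hypothetical wrangling action: remove rows whose value is missing."""
    return [r for r in rows if r.get(value_key) is not None]


class History:
    """Snapshot-based undo/redo over successive versions of the data (assumed design)."""

    def __init__(self, rows):
        self._undo = []
        self._redo = []
        self.rows = deepcopy(rows)

    def apply(self, action, *args):
        # Save the current version, then run the action on a copy of it.
        self._undo.append(self.rows)
        self._redo.clear()
        self.rows = action(deepcopy(self.rows), *args)

    def undo(self):
        if self._undo:
            self._redo.append(self.rows)
            self.rows = self._undo.pop()

    def redo(self):
        if self._redo:
            self._undo.append(self.rows)
            self.rows = self._redo.pop()


# Made-up sample data for illustration only.
rows = [
    {"country": "US", "income": 52000},
    {"country": "US", "income": None},
    {"country": "FR", "income": 41000},
    {"country": "FR", "income": None},
    {"country": "FR", "income": None},
]

history = History(rows)
print(missing_by_group(history.rows, "country", "income"))  # {'US': 1, 'FR': 2}
history.apply(drop_missing, "income")
print(len(history.rows))  # 2
history.undo()
print(len(history.rows))  # 5
```

Snapshots keep the sketch short but copy the whole dataset on every action. A system working on large tables would more plausibly record invertible operations or diffs instead.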
Abstract
Preparing datasets -- a critical phase known as data wrangling -- constitutes the dominant phase of data science development, consuming upwards of 80% of the total project time. This phase encompasses a myriad of tasks: parsing data, restructuring it for analysis, repairing inaccuracies, merging sources, eliminating duplicates, and ensuring overall data integrity. Traditional approaches, typically through manual coding in languages such as Python or using spreadsheets, are not only laborious but also error-prone. These issues range from missing entries and formatting inconsistencies to data type inaccuracies, all of which can affect the quality of downstream tasks if not properly corrected. To address these challenges, we present Buckaroo, a visualization system to highlight discrepancies in data and enable on-the-spot corrections through direct manipulations of visual objects. Buckaroo…
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Digital Media Forensic Detection · Advanced Vision and Imaging
