OSCAR: A Semantic-based Data Binning Approach
Vidya Setlur, Michael Correll, Sarah Battersby

TL;DR
OSCAR is a semantic-aware binning method that automatically selects bins from a field's inferred semantic type, producing bins that users prefer over traditional statistical binning in visualizations.
Contribution
The paper presents OSCAR, a semantic binning method that derives a set of binning categories from common survey categories and Tableau Public visualizations, then selects bins automatically based on the inferred semantic type of a field.
Findings
Users prefer OSCAR-generated bins over traditional statistical bins.
OSCAR effectively infers semantic categories for data binning.
A crowdsourced study with 120 participants compared user preferences for OSCAR-generated bins against the binning provided in Tableau.
Abstract
Binning is applied to categorize data values or to see distributions of data. Existing binning algorithms often rely on statistical properties of data. However, there are semantic considerations for selecting appropriate binning schemes. Surveys, for instance, gather respondent data for demographic-related questions such as age, salary, number of employees, etc., that are bucketed into defined semantic categories. In this paper, we leverage common semantic categories from survey data and Tableau Public visualizations to identify a set of semantic binning categories. We employ these semantic binning categories in OSCAR: a method for automatically selecting bins based on the inferred semantic type of the field. We conducted a crowdsourced study with 120 participants to better understand user preferences for bins generated by OSCAR vs. binning provided in Tableau. We find that maps and…
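The idea of selecting bins from an inferred semantic type can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword-based type inference, the `SEMANTIC_BINS` and `TYPE_KEYWORDS` tables, the specific bin edges, and all function names are hypothetical, chosen only to show the general shape of the approach. Fields with no recognized semantic type fall back to plain equal-width binning, standing in for a purely statistical scheme.

```python
from bisect import bisect_right

# Hypothetical bin edges per semantic type, for illustration only.
# These are NOT the categories identified in the paper.
SEMANTIC_BINS = {
    "age": [18, 25, 35, 45, 55, 65],
    "salary": [25_000, 50_000, 75_000, 100_000, 150_000],
}

# Hypothetical keywords used to guess a semantic type from a field name.
TYPE_KEYWORDS = {
    "age": ("age",),
    "salary": ("salary", "income", "wage"),
}


def infer_semantic_type(field_name):
    """Guess a semantic type from the field name, or return None."""
    tokens = field_name.lower().replace("_", " ").split()
    for semantic_type, keywords in TYPE_KEYWORDS.items():
        if any(token in keywords for token in tokens):
            return semantic_type
    return None


def equal_width_edges(values, k=5):
    """Statistical fallback: interior edges of k equal-width bins."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]


def bin_label(value, edges):
    """Label the half-open interval of `edges` that contains `value`."""
    i = bisect_right(edges, value)
    if i == 0:
        return f"< {edges[0]}"
    if i == len(edges):
        return f">= {edges[-1]}"
    return f"[{edges[i - 1]}, {edges[i]})"


def bin_field(field_name, values, k=5):
    """Use semantic edges when a type is inferred, else equal-width edges."""
    semantic_type = infer_semantic_type(field_name)
    if semantic_type is not None:
        edges = SEMANTIC_BINS[semantic_type]
    else:
        edges = equal_width_edges(values, k)
    return [bin_label(v, edges) for v in values]


if __name__ == "__main__":
    print(bin_field("respondent_age", [17, 18, 30, 70]))
    print(bin_field("score", [0, 10]))
```

In this sketch the semantic path yields familiar, survey-style ranges regardless of the data's distribution, while the fallback path yields edges that depend entirely on the observed minimum and maximum.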
Taxonomy
Topics: Data Quality and Management · Data-Driven Disease Surveillance · Human Mobility and Location-Based Analysis
