Visualising category recoding and numeric redistributions
Cynthia A. Huang

TL;DR
This paper introduces graphical tools for visualising and auditing data transformations between distinct but semantically related taxonomies, focusing on category recoding and numeric redistribution in ex-post data harmonisation workflows.
Contribution
It proposes a new task abstraction, the cross-taxonomy transformation, and an associated graph-based information structure, the crossmap, as a basis for visualising these transformations.
Findings
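Illustrates the crossmap with a conversion of numeric statistics from a country-specific taxonomy to an international classification standard
Discusses opportunities for using visualisation to audit and communicate cross-taxonomy transformations
Highlights challenges in visualising complex category recodings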
Abstract
This paper proposes graphical representations of data and rationale provenance in workflows that convert both category labels and associated numeric data between distinct but semantically related taxonomies. We motivate the graphical representations with a new task abstraction, the cross-taxonomy transformation, and associated graph-based information structure, the crossmap. The task abstraction supports the separation of category recoding and numeric redistribution decisions from the specifics of data manipulation in ex-post data harmonisation. The crossmap structure is illustrated using an example conversion of numeric statistics from a country-specific taxonomy to an international classification standard. We discuss the opportunities and challenges of using visualisation to audit and communicate cross-taxonomy transformations and present candidate graphical representations.
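To make the separation of recoding and redistribution decisions concrete, the following is a minimal sketch in Python, not the paper's implementation. It assumes a crossmap can be stored as a list of weighted edges (source category, target category, weight), where the weights leaving each source category sum to one so that numeric totals are preserved. The category codes, the values, and the function names `validate_crossmap` and `apply_crossmap` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical crossmap: each edge is (source_code, target_code, weight).
# A weight of 1.0 is a pure recode; fractional weights split a source value
# across several target categories (a numeric redistribution).
CROSSMAP = [
    ("A1", "X", 1.0),   # one-to-one recode
    ("A2", "X", 1.0),   # many-to-one: A1 and A2 both collapse into X
    ("A3", "Y", 0.5),   # one-to-many: A3 is split evenly
    ("A3", "Z", 0.5),   #   between Y and Z
]


def validate_crossmap(edges, tol=1e-9):
    """Check the assumed invariant: weights out of each source sum to 1."""
    totals = defaultdict(float)
    for source, _target, weight in edges:
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight out of range for source {source!r}")
        totals[source] += weight
    bad = {s: t for s, t in totals.items() if abs(t - 1.0) > tol}
    if bad:
        raise ValueError(f"outgoing weights do not sum to 1: {bad}")


def apply_crossmap(edges, values):
    """Redistribute source-taxonomy values onto the target taxonomy."""
    validate_crossmap(edges)
    known_sources = {source for source, _target, _weight in edges}
    missing = set(values) - known_sources
    if missing:
        raise KeyError(f"no crossmap edges for sources: {sorted(missing)}")
    result = defaultdict(float)
    for source, target, weight in edges:
        if source in values:
            result[target] += weight * values[source]
    return dict(result)


# Hypothetical source-taxonomy statistics.
source_values = {"A1": 100.0, "A2": 50.0, "A3": 80.0}
target_values = apply_crossmap(CROSSMAP, source_values)
print(target_values)
```

Keeping the edge list separate from the code that applies it means the mapping decisions can be inspected, validated, or drawn as a bipartite graph without reading the data manipulation logic.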
Taxonomy
Topics: Scientific Computing and Data Management · Research Data Management Practices · Semantic Web and Ontologies
