The Categorical Data Map: A Multidimensional Scaling-Based Approach
Frederik L. Dennig, Lucas Joos, Patrick Paetzold, Daniela Blumberg, Oliver Deussen, Daniel A. Keim, Maximilian T. Fischer

TL;DR
The paper introduces a multidimensional scaling-based visualization technique for categorical data that enables similarity-based analysis and scales better than traditional set-based methods.
Contribution
It presents a novel dimensionality reduction approach for categorical data visualization, including a prototype with enhanced encoding and measures for visual quality assessment.
Findings
Effective detection of data groups and attribute influence in visualizations
Superior scalability over Euler diagrams and Parallel Sets
Validated through expert study on large datasets with many categories
Abstract
Categorical data does not have an intrinsic definition of distance or order, and therefore, established visualization techniques for categorical data only allow for a set-based or frequency-based analysis, e.g., through Euler diagrams or Parallel Sets, and do not support a similarity-based analysis. We present a novel dimensionality reduction-based visualization for categorical data, which is based on defining the distance of two data items as the number of varying attributes. Our technique enables users to pre-attentively detect groups of similar data items and observe the properties of the projection, such as attributes strongly influencing the embedding. Our prototype visually encodes data properties in an enhanced scatterplot-like visualization, encoding attributes in the background to show the distribution of categories. In addition, we propose two graph-based measures to quantify…
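The distance definition in the abstract (the number of attributes on which two items differ) can be sketched in a few lines. This is only a minimal illustration, not the authors' implementation: the toy items, the function names, and the choice of classical (Torgerson) MDS are assumptions, since the abstract does not state which MDS variant the prototype uses.

```python
import numpy as np

def hamming_distance_matrix(items):
    """Pairwise distance = number of attributes on which two items differ."""
    n = len(items)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = sum(a != b for a, b in zip(items[i], items[j]))
            dist[i, j] = dist[j, i] = d
    return dist

def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS; an assumed variant, chosen for brevity."""
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # Keep the eigenvectors with the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues that arise for non-Euclidean distances.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)

# Hypothetical toy data: each item is a tuple of categorical attribute values.
items = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
]
dist = hamming_distance_matrix(items)
coords = classical_mds(dist)  # one 2D point per item, suitable for a scatterplot
```

Items that differ in few attributes should land close together in `coords`, which is what makes groups of similar items visible in a scatterplot-like view.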
Taxonomy
Topics: Text and Document Classification Technologies · Data Mining Algorithms and Applications
