Incorporation of Human Knowledge into Data Embeddings to Improve Pattern Significance and Interpretability
Jie Li, Chun-qi Zhou

TL;DR
This paper introduces a method to incorporate human knowledge into data embeddings by using explicit labels and classification loss, enhancing the significance and interpretability of visualized data patterns.
Contribution
It proposes a novel embedding approach that externalizes human knowledge and integrates classification loss to produce more meaningful and interpretable data visualizations.
Findings
Improved pattern significance and interpretability in embeddings.
Effective integration of human knowledge into the embedding process.
A user study and quantitative results demonstrate the approach's usability.
Abstract
Embedding is a common technique for analyzing multi-dimensional data. However, the embedding projection cannot always form significant and interpretable visual structures that foreshadow underlying data patterns. We propose an approach that incorporates human knowledge into data embeddings to improve pattern significance and interpretability. The core idea is (1) externalizing tacit human knowledge as explicit sample labels and (2) adding a classification loss in the embedding network to encode samples' classes. The approach pulls samples of the same class with similar data features closer in the projection, leading to more compact (significant) and class-consistent (interpretable) visual structures. We give an embedding network with a customized classification loss to implement the idea and integrate the network into a visualization system to form a workflow that supports flexible…
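The two-part idea in the abstract (labels as externalized knowledge, plus a classification term in the embedding objective) can be illustrated with a minimal sketch. This is not the authors' network or their customized loss, which the summary does not specify. It assumes a plain softmax cross-entropy as the classification term and an MDS-style distance-preservation term as a stand-in for the embedding objective. The names `structure_loss`, `softmax_cross_entropy`, `combined_loss`, and the `weight` parameter are hypothetical.

```python
import numpy as np


def pairwise_dists(x):
    """Euclidean distance between every pair of rows in x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def structure_loss(high, low):
    """Stand-in embedding objective (assumption): mean squared difference
    between pairwise distances in data space and in the 2-D projection."""
    return ((pairwise_dists(high) - pairwise_dists(low)) ** 2).mean()


def softmax_cross_entropy(logits, labels):
    """Stand-in classification term (assumption): mean softmax cross-entropy
    of per-sample class scores against the human-provided labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def combined_loss(high, low, logits, labels, weight=1.0):
    """Embedding objective plus a weighted classification term.
    `weight` is a hypothetical trade-off knob, not taken from the paper."""
    return structure_loss(high, low) + weight * softmax_cross_entropy(logits, labels)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    high = rng.normal(size=(6, 5))          # six samples, five data features
    low = rng.normal(size=(6, 2))           # their 2-D projection
    logits = rng.normal(size=(6, 3))        # class scores from a classifier head
    labels = np.array([0, 0, 1, 1, 2, 2])   # labels supplied by the analyst
    print(combined_loss(high, low, logits, labels, weight=0.5))
```

In a trainable version, minimizing the second term would push samples with the same label toward regions of the projection that a classifier can separate, which is one plausible way to obtain the compact, class-consistent structures the abstract describes.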
Taxonomy
Topics: Data Visualization and Analytics · Cell Image Analysis Techniques · Image Retrieval and Classification Techniques
