TL;DR
This paper introduces an interactive data exploration system that uses an information-theoretic model to incorporate user feedback, enabling more effective discovery of data patterns without requiring mathematical expertise.
Contribution
It presents a novel theoretical framework and an open-source system that formalize user knowledge as constraints to guide data visualization and pattern discovery.
Findings
System allows efficient learning from data sources
Prototype performs well in practice
User feedback effectively guides exploration
Abstract
Visual exploration of high-dimensional real-valued datasets is a fundamental task in exploratory data analysis (EDA). Existing methods use predefined criteria to choose the representation of data. There is a lack of methods that (i) elicit from the user what she has learned from the data and (ii) show patterns that she does not know yet. We construct a theoretical model in which identified patterns can be input to the system as knowledge. The knowledge syntax is intuitive, such as "this set of points forms a cluster", and requires no knowledge of maths. This background knowledge is used to find a Maximum Entropy distribution of the data, after which the system provides the user with data projections in which the data and the Maximum Entropy distribution differ the most, thus showing the user aspects of the data that are maximally informative given the user's current knowledge. We provide…
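The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on simplifying assumptions: the background distribution is a single Gaussian (the Maximum Entropy distribution under mean and covariance constraints), and "differ the most" is scored per direction by the KL divergence between two one-dimensional zero-mean Gaussians. The function name `informative_directions` and its parameters are hypothetical.

```python
import numpy as np

def informative_directions(data, bg_mean, bg_cov, k=2):
    """Return k directions (in whitened space) where the data deviates most
    from a Gaussian background model, together with their scores.

    Assumption: the user's knowledge is summarised by one Gaussian with
    mean bg_mean and covariance bg_cov. This is an illustrative stand-in,
    not the paper's constraint model.
    """
    # Whiten with respect to the background: if the data followed the
    # background model, the whitened points would have identity covariance.
    chol = np.linalg.cholesky(bg_cov)
    whitened = np.linalg.solve(chol, (data - bg_mean).T).T

    # Second moment about the background mean (the origin after whitening),
    # so that both mean shifts and variance mismatches are captured.
    second_moment = whitened.T @ whitened / len(whitened)
    eigvals, eigvecs = np.linalg.eigh(second_moment)
    eigvals = np.maximum(eigvals, 1e-12)  # guard against log(0)

    # KL divergence from N(0, lambda) to N(0, 1) along each eigen-direction;
    # it is zero exactly when the data matches the background there.
    scores = 0.5 * (eigvals - np.log(eigvals) - 1.0)
    order = np.argsort(scores)[::-1][:k]
    return eigvecs[:, order].T, scores[order]

# Toy usage: the background is a standard normal, but the data is
# stretched along the first axis, which the user does not know yet.
rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 3))
data[:, 0] *= 3.0
directions, scores = informative_directions(data, np.zeros(3), np.eye(3))
```

In this toy setting the top-scoring direction aligns with the stretched axis. In an interactive loop, the user's feedback (for example, marking a cluster) would update the background distribution, and the scoring would be repeated against the updated model.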
