Into the Void: Mapping the Unseen Gaps in High Dimensional Data
Xinyu Zhang, Tyler Estro, Geoff Kuenning, Erez Zadok, Klaus Mueller

TL;DR
This paper introduces GapMiner, a visual analytics system with a novel Empty Space Search Algorithm to identify and exploit unseen gaps in high-dimensional data, improving configuration discovery for machine learning tasks.
Contribution
The paper presents a new pipeline combining visualization, user interaction, and deep learning to systematically explore and utilize voids in high-dimensional datasets, outperforming traditional random methods.
Findings
The method produces significantly better configurations than random approaches.
Interactive exploration enhances dataset quality and model training.
The system effectively supports diverse machine learning objectives.
Abstract
We present a comprehensive pipeline, augmented by a visual analytics system named "GapMiner", that is aimed at exploring and exploiting untapped opportunities within the empty areas of high-dimensional datasets. Our approach begins with an initial dataset and then uses a novel Empty Space Search Algorithm (ESA) to identify the center points of these uncharted voids, which are regarded as reservoirs containing potentially valuable novel configurations. Initially, this process is guided by user interactions facilitated by GapMiner. GapMiner visualizes the Empty Space Configurations (ESC) identified by the search within the context of the data, enabling domain experts to explore and adjust ESCs using a linked parallel-coordinate display. These interactions enhance the dataset and contribute to the iterative training of a connected deep neural network (DNN). As the DNN trains, it…
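The idea of locating the center points of voids can be illustrated with a minimal sketch. This is not the paper's ESA, whose details are not given in the abstract; it is a simple stand-in that assumes a void center can be approximated by the randomly sampled point whose nearest data point is farthest away. The function name `find_empty_space_centers` and all of its parameters are hypothetical.

```python
import numpy as np


def find_empty_space_centers(data, n_candidates=2000, n_centers=3, seed=0):
    """Hypothetical helper (not the paper's ESA): return the sampled points
    that lie farthest from every existing data point."""
    rng = np.random.default_rng(seed)

    # Sample candidates uniformly inside the bounding box of the data.
    lo, hi = data.min(axis=0), data.max(axis=0)
    candidates = rng.uniform(lo, hi, size=(n_candidates, data.shape[1]))

    # Euclidean distance from each candidate to its nearest data point.
    diffs = candidates[:, None, :] - data[None, :, :]
    nearest = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

    # Keep the candidates with the largest nearest-neighbor distance.
    order = np.argsort(nearest)[::-1][:n_centers]
    return candidates[order], nearest[order]


# Toy dataset: two tight clusters in 5 dimensions with a gap between them.
rng = np.random.default_rng(42)
data = np.vstack([
    rng.normal(0.0, 0.1, size=(100, 5)),
    rng.normal(1.0, 0.1, size=(100, 5)),
])
centers, gaps = find_empty_space_centers(data)
print(centers.shape, gaps)
```

Random sampling like this scales poorly as dimensionality grows, which is part of the motivation for a dedicated search algorithm. In the pipeline the abstract describes, points found this way would correspond to the candidate configurations that a domain expert inspects and adjusts.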
Taxonomy
Topics: Machine Learning and Data Classification · Big Data Technologies and Applications · Advanced Clustering Algorithms Research
Methods: Visual Analytics
