Index $t$-SNE: Tracking Dynamics of High-Dimensional Datasets with Coherent Embeddings
Gaëlle Candel, David Naccache

TL;DR
This paper introduces a method to reuse and adapt existing t-SNE embeddings to track the evolution of high-dimensional datasets over time, preserving cluster positions and enabling dynamic analysis.
Contribution
The paper proposes a novel approach to reuse t-SNE embeddings for dynamic datasets, maintaining cluster positions and reducing computational complexity compared to re-embedding from scratch.
Findings
Effective tracking of cluster evolution in real-world datasets
Lower computational complexity for embedding slices of data
Facilitates monitoring of dataset dynamics over time
Abstract
t-SNE is an embedding method that the data science community has widely used. Two interesting characteristics of t-SNE are the structure preservation property and the answer to the crowding problem, where all neighbors in high dimensional space cannot be represented correctly in low dimensional space. t-SNE preserves the local neighborhood, and similar items are nicely spaced by adjusting to the local density. These two characteristics produce a meaningful representation, where the cluster area is proportional to its size in number, and relationships between clusters are materialized by closeness on the embedding. This algorithm is non-parametric, therefore two initializations of the algorithm would lead to two different embeddings. In a forensic approach, analysts would like to compare two or more datasets using their embedding. An approach would be to learn a parametric model over an…
Taxonomy
Topics: Time Series Analysis and Forecasting · Data Stream Mining Techniques · Anomaly Detection Techniques and Applications
