TL;DR
The paper introduces the Sequencer, an algorithm that identifies the main trend in a dataset by measuring how elongated its similarity graphs are. In a number of cases it outperforms the t-SNE and UMAP dimensionality reduction techniques.
Contribution
The Sequencer provides a generic way to detect the main trend in a dataset without training or parameter tuning, which makes it suited to exploratory analysis across scientific fields.
Findings
Outperforms t-SNE and UMAP in a number of cases on real-world datasets
Works without training or parameter tuning
Applicable across astronomy, geology, and image analysis
Abstract
Scientists aim to extract simplicity from observations of the complex world. An important component of this process is the exploration of data in search of trends. In practice, however, this tends to be more of an art than a science. Among all trends existing in the natural world, one-dimensional trends, often called sequences, are of particular interest as they provide insights into simple phenomena. However, some are challenging to detect as they may be expressed in complex manners. We present the Sequencer, an algorithm designed to generically identify the main trend in a dataset. It does so by constructing graphs describing the similarities between pairs of observations, computed with a set of metrics and scales. Using the fact that continuous trends lead to more elongated graphs, the algorithm can identify which aspects of the data are relevant in establishing a global sequence.…
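The idea that continuous trends produce more elongated graphs can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single Euclidean metric at a single scale, builds a minimum spanning tree over the observations, and uses a hypothetical elongation proxy (the longest path in the tree, counted in edges, divided by N - 1). The function names `pairwise_distances`, `minimum_spanning_tree`, `farthest_node`, and `elongation` are illustrative and do not come from the paper.

```python
import math
from collections import deque


def pairwise_distances(points):
    """Euclidean distance between every pair of observations."""
    return [[math.dist(p, q) for q in points] for p in points]


def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense distance matrix; returns an adjacency list."""
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    adjacency = [[] for _ in range(n)]
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            adjacency[u].append(parent[u])
            adjacency[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return adjacency


def farthest_node(adjacency, start):
    """Breadth-first search; returns (node, hops) for the node farthest from start."""
    hops = {start: 0}
    queue = deque([start])
    last = start
    while queue:
        last = queue.popleft()  # nodes are popped in non-decreasing hop order
        for nxt in adjacency[last]:
            if nxt not in hops:
                hops[nxt] = hops[last] + 1
                queue.append(nxt)
    return last, hops[last]


def elongation(points):
    """Hypothetical proxy: longest tree path (in edges) divided by N - 1.

    Equals 1.0 when the tree is a single chain and shrinks as it branches.
    """
    tree = minimum_spanning_tree(pairwise_distances(points))
    end, _ = farthest_node(tree, 0)          # first sweep finds one end of the longest path
    _, diameter = farthest_node(tree, end)   # second sweep measures its length
    return diameter / (len(points) - 1)


# Observations along a line, given in shuffled order: the tree is one chain.
chain = [(float(i), 0.0) for i in (3, 0, 4, 1, 2)]
# Observations on a cross with four arms: the tree branches at the centre.
cross = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (-1.0, 0.0), (-2.0, 0.0),
         (0.0, 1.0), (0.0, 2.0), (0.0, -1.0), (0.0, -2.0)]

print(elongation(chain))  # 1.0
print(elongation(cross))  # 0.5
```

Under these assumptions, one way to compare several candidate metrics or scales would be to compute this score for each and favour those that yield the most elongated tree; how the Sequencer itself weights and combines them is not specified in the abstract above.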
