Monitoring the shape of weather, soundscapes, and dynamical systems: a new statistic for dimension-driven data analysis on large data sets
Henry Kvinge, Elin Farnell, Michael Kirby, Chris Peterson

TL;DR
This paper introduces the $\kappa$-profile, a new statistic derived from a dimensionality-reduction optimization problem, for analyzing and monitoring large data sets such as weather, soundscape, and dynamical-systems data by tracking their intrinsic dimension.
Contribution
The paper proposes the $\kappa$-profile, a novel statistic based on secant-preserving projections, for analyzing the intrinsic dimension of large data sets.
Findings
The $\kappa$-profile characterizes data complexity in weather, soundscape, and dynamical-systems data sets.
Algorithms such as Secant-Avoidance Projection make computing the $\kappa$-profile feasible for large data sets.
The $\kappa$-profile gives insight into how a data set behaves and supports monitoring of its intrinsic dimension.
Abstract
Dimensionality-reduction methods are a fundamental tool in the analysis of large data sets. These algorithms work on the assumption that the "intrinsic dimension" of the data is generally much smaller than the ambient dimension in which it is collected. Alongside their usual purpose of mapping data into a smaller dimension with minimal information loss, dimensionality-reduction techniques implicitly or explicitly provide information about the dimension of the data set. In this paper, we propose a new statistic that we call the $\kappa$-profile for analysis of large data sets. The $\kappa$-profile arises from a dimensionality-reduction optimization problem: namely that of finding a projection into $k$-dimensions that optimally preserves the secants between points in the data set. From this optimal projection we extract the norm of the shortest projected secant from among the…
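The construction described in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm: it assumes the secants are normalized to unit length, and it swaps the optimal secant-preserving projection for a plain SVD (PCA-style) projection of the secant set, so each value is only a lower bound on what an optimized projection would achieve. The function names `unit_secants` and `kappa_profile_pca` are hypothetical.

```python
import numpy as np


def unit_secants(X):
    """Return all pairwise unit secants (x_i - x_j) / ||x_i - x_j|| for i < j.

    Assumption: secants are normalized to unit length before projection.
    """
    i, j = np.triu_indices(len(X), k=1)
    diffs = X[i] - X[j]
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > 0  # duplicate points have no secant direction
    return diffs[keep] / norms[keep, None]


def kappa_profile_pca(X, max_dim=None):
    """Approximate profile: shortest projected secant norm for k = 1..max_dim.

    Stand-in technique: the projection onto the top-k right singular vectors
    of the secant matrix is used instead of an optimized projection.
    """
    S = unit_secants(X)
    # Rows of Vt are orthonormal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(S, full_matrices=False)
    max_dim = max_dim or Vt.shape[0]
    profile = []
    for k in range(1, max_dim + 1):
        projected = S @ Vt[:k].T  # coordinates of each secant in k dimensions
        profile.append(np.linalg.norm(projected, axis=1).min())
    return np.array(profile)


# Usage: a circle embedded in 5 ambient dimensions.
theta = np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta), np.zeros((20, 3))])
print(kappa_profile_pca(circle))
```

Because every secant of the circle lies in a two-dimensional plane, the values printed for $k \geq 2$ equal 1 up to rounding, while the value for $k = 1$ is smaller. The dimension at which the profile saturates is what makes it informative about intrinsic dimension.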
