Intrinsic dimension estimation of data by principal component analysis
Mingyu Fan, Nannan Gu, Hong Qiao, Bo Zhang

TL;DR
This paper introduces a novel PCA-based approach for estimating the intrinsic dimension of data with nonlinear structures, utilizing local PCA on data covers to improve accuracy and noise filtering.
Contribution
A new PCA-based method for intrinsic dimension estimation that handles nonlinear data structures and supports incremental learning.
Findings
Effective on synthetic and real data sets
Filters out noise and converges with larger neighborhoods
Works incrementally on large data sets
Abstract
Estimating the intrinsic dimensionality of data is a classic problem in pattern recognition and statistics. Principal Component Analysis (PCA) is a powerful tool for discovering the dimensionality of data sets with a linear structure; it becomes ineffective, however, when the data have a nonlinear structure. In this paper, we propose a new PCA-based method to estimate the intrinsic dimension of data with nonlinear structures. Our method works by first finding a minimal cover of the data set, then performing PCA locally on each subset in the cover, and finally producing the estimate by examining the data variance across all of the small neighborhood regions. The proposed method uses the whole data set to estimate its intrinsic dimension and is convenient for incremental learning. In addition, our new PCA procedure can filter out noise in data and converge to a stable estimation with the neighborhood…
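The three steps named in the abstract (cover, local PCA, variance check) can be illustrated with a minimal sketch. This is not the authors' algorithm: the greedy k-nearest-neighbor cover (which is not guaranteed to be minimal), the 95% explained-variance threshold, the majority vote over subsets, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np


def greedy_knn_cover(X, k):
    """Greedy cover by k-NN neighborhoods (an assumption; not guaranteed minimal)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # pairwise squared distances
    np.fill_diagonal(d2, -np.inf)                     # each point sorts first in its own row
    nbrs = np.argsort(d2, axis=1)[:, :k]              # the point itself plus its k-1 nearest
    covered = np.zeros(X.shape[0], dtype=bool)
    subsets = []
    for i in range(X.shape[0]):
        if not covered[i]:                            # open a new subset only where needed
            subsets.append(nbrs[i])
            covered[nbrs[i]] = True
    return subsets


def local_dim(P, var_threshold):
    """Number of principal components needed to reach var_threshold of the variance of P."""
    C = P - P.mean(axis=0)
    var = np.linalg.svd(C, compute_uv=False) ** 2     # proportional to the PCA eigenvalues
    ratio = np.cumsum(var) / var.sum()
    return min(int(np.searchsorted(ratio, var_threshold)) + 1, len(var))


def estimate_intrinsic_dim(X, k=20, var_threshold=0.95):
    """Majority vote of the local PCA dimensions over all subsets in the cover."""
    dims = [local_dim(X[idx], var_threshold) for idx in greedy_knn_cover(X, k)]
    return int(np.bincount(dims).argmax())


# Usage: a circle is a one-dimensional curve embedded in three dimensions.
t = np.linspace(0.0, 2.0 * np.pi, 500, endpoint=False)
circle = np.column_stack([np.cos(t), np.sin(t), np.zeros_like(t)])
print(estimate_intrinsic_dim(circle, k=10))
```

Global PCA on the circle would report two significant directions, whereas each small arc is nearly straight, which is why the local estimate differs. The neighborhood size `k` and the threshold both influence the result in this sketch, so they would need tuning for noisy data.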
Taxonomy
Topics: Neural Networks and Applications · Chaos control and synchronization · Statistical and numerical algorithms
