Functional Principal Subspace Sampling for Large Scale Functional Data Analysis
Shiyuan He, Xiaomeng Yan

TL;DR
This paper introduces randomized algorithms for scalable functional principal component analysis and functional linear regression, utilizing a novel importance sampling method to efficiently estimate functional subspaces in large datasets.
Contribution
The paper proposes a functional principal subspace sampling method that improves scalability of FDA techniques by reducing computational costs while maintaining accuracy.
Findings
The proposed algorithms effectively handle large datasets with low intrinsic dimension.
The importance sampling approach accurately estimates functional subspaces.
Experimental results demonstrate improved efficiency and accuracy on synthetic and real data.
Abstract
Functional data analysis (FDA) methods have computational and theoretical appeal for some high-dimensional data, but they do not scale to modern large-sample datasets. To tackle this challenge, we develop randomized algorithms for two important FDA methods: functional principal component analysis (FPCA) and functional linear regression (FLR) with scalar response. The two methods are connected, as both rely on accurate estimation of the functional principal subspace. The proposed algorithms draw subsamples from the large dataset at hand and apply FPCA or FLR to the subsamples to reduce the computational cost. To effectively preserve subspace information in the subsamples, we propose a functional principal subspace sampling probability, which removes the eigenvalue scale effect inside the functional principal subspace and properly weights the residual. Based on the operator…
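The subsample-then-estimate idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not state the exact sampling probability, so the weighting below (whitened scores inside a pilot subspace plus a residual term scaled by the pilot tail variance) is an assumption chosen only to mimic the described behaviour. The function names `subspace_sampling_probs` and `subsampled_fpca`, the `pilot_size` parameter, and the use of curves observed on a common, equally spaced grid are all hypothetical.

```python
import numpy as np

def subspace_sampling_probs(X, k, pilot_size=200, rng=None):
    """Illustrative sampling probabilities for curves X of shape (n, p), k < p.

    Assumption: a small uniform pilot sample gives a rough top-k eigenbasis.
    """
    rng = np.random.default_rng(rng)
    n, _ = X.shape
    pilot = X[rng.choice(n, size=min(pilot_size, n), replace=False)]
    mean = pilot.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(pilot, rowvar=False))
    evals, evecs = evals[::-1], evecs[:, ::-1]      # sort descending
    V = evecs[:, :k]
    lam = np.maximum(evals[:k], 1e-12)
    Xc = X - mean
    scores = Xc @ V
    # Inside the subspace: divide by eigenvalues to remove the scale effect.
    inside = np.sum(scores**2 / lam, axis=1)
    # Outside the subspace: residual energy relative to the pilot tail variance.
    resid = Xc - scores @ V.T
    tail = max(evals[k:].sum(), 1e-12)
    outside = np.sum(resid**2, axis=1) / tail
    w = inside + outside
    return w / w.sum()

def subsampled_fpca(X, k, m, rng=None):
    """Draw m curves with replacement and run inverse-probability-weighted PCA."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    probs = subspace_sampling_probs(X, k, rng=rng)
    idx = rng.choice(n, size=m, replace=True, p=probs)
    w = 1.0 / (m * probs[idx])                      # inverse-probability weights
    mean = (w[:, None] * X[idx]).sum(axis=0) / n    # weighted estimate of the mean curve
    Xc = X[idx] - mean
    cov = (Xc * w[:, None]).T @ Xc / n              # weighted covariance estimate
    evals, evecs = np.linalg.eigh(cov)
    return evecs[:, ::-1][:, :k], evals[::-1][:k]

# Synthetic example: two dominant smooth components plus small noise.
rng = np.random.default_rng(0)
n, p, k = 5000, 30, 2
t = np.linspace(0.0, 1.0, p)
basis = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)], axis=1)
basis, _ = np.linalg.qr(basis)                      # orthonormal columns
X = rng.normal(size=(n, k)) * np.array([3.0, 2.0]) @ basis.T
X += 0.1 * rng.normal(size=(n, p))
V_hat, lam_hat = subsampled_fpca(X, k=k, m=500, rng=1)
```

Here the eigen-decomposition is run on 500 reweighted curves rather than all 5000; the inverse-probability weights are what keep the covariance estimate approximately unbiased under non-uniform sampling. For curves on an uneven grid, quadrature weights would be needed in the inner products, which this sketch omits.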
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Advanced Statistical Methods and Models
