Two derivations of Principal Component Analysis on datasets of distributions
Vlad Niculae

TL;DR
This note extends PCA from datasets of points to datasets of distributions, each characterized by its location and covariance, and derives a closed-form solution from both the variance-maximization and the reconstruction-error-minimization perspectives.
Contribution
It introduces a novel formulation of PCA for distributional datasets and provides two derivations, one based on variance maximization and another on reconstruction error minimization.
Findings
Closed-form solution for distributional PCA
Equivalent derivations from variance and reconstruction perspectives
Broadens PCA applicability to distributional data
Abstract
In this brief note, we formulate Principal Component Analysis (PCA) over datasets consisting not of points but of distributions, characterized by their location and covariance. Just like the usual PCA on points can be equivalently derived via a variance-maximization principle and via a minimization of reconstruction error, we derive a closed-form solution for distributional PCA from both of these perspectives.
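The variance-maximization view can be illustrated with a minimal sketch. It is not taken from the note itself and rests on an assumption: the dataset is treated as a uniform mixture of the given distributions. Under that assumption, the law of total covariance gives the mixture covariance as the average of the per-distribution covariances plus the covariance of the locations, and the directions of maximal projected variance are its leading eigenvectors. The function name `distributional_pca` and its arguments are hypothetical, chosen only for illustration.

```python
import numpy as np

def distributional_pca(means, covs, k):
    """Sketch of PCA over distributions given by location and covariance.

    Assumption (not from the source): the dataset is a uniform mixture.
    means: array of shape (n, d), one location per distribution.
    covs:  array of shape (n, d, d), one covariance per distribution.
    Returns (components of shape (k, d), variances of shape (k,)).
    """
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    # Covariance of the locations around their average (between-distribution part).
    centered = means - means.mean(axis=0)
    between = centered.T @ centered / len(means)

    # Average of the individual covariances (within-distribution part).
    within = covs.mean(axis=0)

    # Law of total covariance for a uniform mixture.
    total = between + within

    # eigh returns eigenvalues in ascending order; keep the k largest.
    eigvals, eigvecs = np.linalg.eigh(total)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, order].T, eigvals[order]
```

When every covariance is zero, each distribution collapses to a point and the sketch reduces to ordinary PCA on the locations. With nonzero covariances, the leading direction can differ from the one obtained from the locations alone, which is the situation the distributional formulation is meant to capture.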
Taxonomy
Topics: Image and Signal Denoising Methods · Blind Source Separation Techniques · Spectroscopy and Chemometric Analyses
Methods: Principal Components Analysis
