Two derivations of Principal Component Analysis on datasets of distributions

Vlad Niculae

arXiv:2306.13503·stat.ML·June 26, 2023·1 cites

TL;DR

This paper extends PCA to datasets whose elements are distributions rather than points, and derives the same closed-form solution from two perspectives: variance maximization and reconstruction-error minimization.

Contribution

It introduces a novel formulation of PCA for distributional datasets and provides two derivations, one based on variance maximization and another on reconstruction error minimization.

Findings

01. Closed-form solution for distributional PCA

02. Equivalent derivations from variance and reconstruction perspectives

03. Broadens PCA applicability to distributional data

Abstract

In this brief note, we formulate Principal Component Analysis (PCA) over datasets consisting not of points but of distributions, characterized by their location and covariance. Just like the usual PCA on points can be equivalently derived via a variance-maximization principle and via a minimization of reconstruction error, we derive a closed-form solution for distributional PCA from both of these perspectives.
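
As a minimal sketch of what such a closed-form solution can look like, the Python below assumes each distribution is summarized by a mean vector and a covariance matrix, and that the principal directions are the top eigenvectors of the average covariance plus the scatter of the means (that is, the covariance of the equal-weight mixture). That specific formula is an assumption, since this page does not state it; the function name `distributional_pca` and the toy data are likewise illustrative.

```python
import numpy as np


def distributional_pca(means, covs, k):
    """Sketch of PCA over distributions given by (mean, covariance) pairs.

    Assumption (not stated on this page): the directions are the top-k
    eigenvectors of  mean_i(Sigma_i) + scatter of the means around their average.
    """
    means = np.asarray(means, dtype=float)  # shape (n, d)
    covs = np.asarray(covs, dtype=float)    # shape (n, d, d)

    center = means.mean(axis=0)
    centered = means - center

    between = centered.T @ centered / len(means)  # spread of the locations
    within = covs.mean(axis=0)                    # average spread inside each distribution
    total = between + within

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(total)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, order], eigvals[order], center


# Toy data: two distributions whose means differ along the first axis,
# but whose own covariance is much larger along the second axis.
means = [[-1.0, 0.0], [1.0, 0.0]]
covs = [np.diag([0.1, 5.0]), np.diag([0.1, 5.0])]

components, variances, center = distributional_pca(means, covs, k=1)
print(components.ravel(), variances)
```

In this toy case, PCA on the two means alone would pick the first axis, while accounting for each distribution's own spread selects the second axis. Setting every covariance to zero reduces the sketch to ordinary PCA on the means.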

Taxonomy

Topics: Image and Signal Denoising Methods · Blind Source Separation Techniques · Spectroscopy and Chemometric Analyses

Methods: Principal Components Analysis