SO(3)-invariant PCA with application to molecular data
Michael Fraiman, Paulina Hoyos, Tamir Bendory, Joe Kileel, Oscar Mickelin, Nir Sharon, and Amit Singer

TL;DR
This paper introduces an SO(3)-invariant PCA method for 3D molecular data that implicitly accounts for all orientations without explicit data augmentation, substantially reducing computational cost.
Contribution
The authors develop a novel SO(3)-invariant PCA framework that leverages algebraic structures to efficiently handle 3D data with unknown orientations.
Findings
Effective on real-world molecular datasets
Reduces computational complexity compared with naive rotation-based data augmentation
Enables large-scale 3D data analysis
Abstract
Principal component analysis (PCA) is a fundamental technique for dimensionality reduction and denoising; however, its application to three-dimensional data with arbitrary orientations -- common in structural biology -- presents significant challenges. A naive approach requires augmenting the dataset with many rotated copies of each sample, incurring prohibitive computational costs. In this paper, we extend PCA to 3D volumetric datasets with unknown orientations by developing an efficient and principled framework for SO(3)-invariant PCA that implicitly accounts for all rotations without explicit data augmentation. By exploiting underlying algebraic structure, we demonstrate that the computation involves only the square root of the total number of covariance entries, resulting in a substantial reduction in complexity. We validate the method on real-world molecular datasets, demonstrating…
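The contrast the abstract draws between explicit augmentation and an implicit, algebraic treatment can be illustrated with a toy example. The sketch below is not the authors' algorithm and does not operate on volumes. It assumes the simplest possible setting, single vectors in R^3 acted on by rotation matrices, where averaging x x^T over all of SO(3) has the closed form (|x|^2 / 3) I. The function names (random_rotation, augmented_covariance, invariant_covariance) and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def random_rotation(rng):
    """Draw a 3x3 rotation matrix uniformly (Haar) via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # fix column signs so Q is uniform on O(3)
    if np.linalg.det(q) < 0:      # flip one column to land in SO(3)
        q[:, 0] *= -1
    return q

def augmented_covariance(points, n_rot, rng):
    """Naive approach: second-moment matrix after adding n_rot rotated copies per sample."""
    cov = np.zeros((3, 3))
    for x in points:
        for _ in range(n_rot):
            y = random_rotation(rng) @ x
            cov += np.outer(y, y)
    return cov / (len(points) * n_rot)

def invariant_covariance(points):
    """Closed form: the average of (R x)(R x)^T over all rotations R is (|x|^2 / 3) I."""
    mean_sq_norm = np.mean(np.sum(points ** 2, axis=1))
    return mean_sq_norm / 3.0 * np.eye(3)

rng = np.random.default_rng(0)
# Synthetic, strongly anisotropic data (an assumption made for this toy example).
points = rng.normal(size=(20, 3)) * np.array([3.0, 1.0, 0.2])

naive = augmented_covariance(points, n_rot=500, rng=rng)   # 10,000 rotated copies
exact = invariant_covariance(points)                        # no rotations sampled
rel_err = np.linalg.norm(naive - exact) / np.linalg.norm(exact)
print(f"relative difference between augmented and closed-form covariance: {rel_err:.3f}")
```

In this toy case the augmented estimate only approaches the closed-form matrix as more rotated copies are added, while the closed form needs no rotations at all. The paper's framework targets the much harder case of 3D volumetric data, where the corresponding structure is richer than a multiple of the identity.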
Taxonomy
Topics: Protein Structure and Dynamics · Tensor decomposition and applications · Topological and Geometric Data Analysis
