New Approaches to Principal Component Analysis for Trees
Burcu Aydın, Gábor Pataki, Haonan Wang, Alim Ladha, Elizabeth Bullitt, J.S. Marron

TL;DR
This paper extends PCA methods for populations of tree-structured data objects by introducing k-tree-lines and tree-curves, which explain more variation per principal component than tree-lines, and applies them to brain vessel structure data.
Contribution
It generalizes the tree-line notion used in earlier tree PCA to k-tree-lines and studies two new sub-cases, 2-tree-lines and tree-curves, with the aim of explaining more variation per principal component while keeping computation efficient.
Findings
2-tree-lines and tree-curves explain more data variation than traditional tree-lines.
Optimal principal component tree-lines can be computed in linear time.
The new PCA methods are applied to the analysis of brain vessel structures.
Abstract
Object Oriented Data Analysis is a new area in statistics that studies populations of general data objects. In this article we consider populations of tree-structured objects as our focus of interest. We develop improved analysis tools for data lying in a binary tree space analogous to classical Principal Component Analysis methods in Euclidean space. Our extensions of PCA are analogs of one dimensional subspaces that best fit the data. Previous work was based on the notion of tree-lines. In this paper, a generalization of the previous tree-line notion is proposed: k-tree-lines. Previously proposed tree-lines are k-tree-lines where k=1. New sub-cases of k-tree-lines studied in this work are the 2-tree-lines and tree-curves, which explain much more variation per principal component than tree-lines. The optimal principal component tree-lines were computable in linear time. Because…
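The abstract does not define tree-lines, so the following is only a minimal sketch under assumptions that are not stated on this page: each binary tree is a set of node indices (root 1, children of node v are 2v and 2v+1), the distance between two trees is the size of the symmetric difference of their node sets, and a tree-line starts at the root-only tree and grows by one node at a time along a single downward path. Under those assumptions, a best-fitting tree-line follows the path whose nodes are contained in the most data trees. The function names `best_tree_line` and `project` are hypothetical, and this is an illustration of the k=1 case rather than the authors' implementation.

```python
from collections import Counter


def best_tree_line(trees, root=1):
    """Return the path of nodes (root excluded) of a best-fitting tree-line.

    Assumption: each tree is a set of level-order indices that contains the
    root, and the tree-line starts from the root-only tree.
    """
    # counts[v] = number of data trees that contain node v
    counts = Counter(v for t in trees for v in t)
    best = {}  # best[v] = largest total count on a downward path starting at v
    nxt = {}   # nxt[v]  = child of v that continues that path, or None
    # Children have larger indices than parents, so descending order visits
    # every child before its parent. Sorting is used here for brevity; a
    # traversal of the support tree would avoid the extra log factor.
    for v in sorted(counts, reverse=True):
        children = [c for c in (2 * v, 2 * v + 1) if c in counts]
        child = max(children, key=lambda c: best[c], default=None)
        best[v] = counts[v] + (best[child] if child is not None else 0)
        nxt[v] = child
    path = []
    v = nxt.get(root)
    while v is not None:
        path.append(v)
        v = nxt[v]
    return path


def project(tree, path, root=1):
    """Project a tree onto the tree-line: keep the root plus the longest
    prefix of the path whose nodes all belong to the tree."""
    kept = {root}
    for v in path:
        if v not in tree:
            break
        kept.add(v)
    return kept


# Toy data, invented for illustration only.
trees = [{1, 2, 4}, {1, 2, 3}, {1, 2, 4, 5}, {1, 3}]
path = best_tree_line(trees)
projection = project({1, 2, 3}, path)
```

Storing trees as index sets keeps the distance and projection to simple set operations; the k-tree-line and tree-curve variants described in the abstract are not covered by this sketch.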
Taxonomy
Topics: Data Visualization and Analytics · Graph Theory and Algorithms · Topological and Geometric Data Analysis
