Median K-flats for hybrid linear modeling with many outliers
Teng Zhang, Arthur Szlam, Gilad Lerman

TL;DR
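The paper introduces Median K-Flats (MKF), an online algorithm for hybrid linear modeling aimed at data with many outliers. It partitions the data into clusters while fitting a d-dimensional linear subspace to each, so that the cumulative l1 error is minimized, and it does so with negligible storage at a cost of order O(n K d D + n d^2 D) for n iterations.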
Contribution
It presents Median K-Flats, a simple online algorithm for hybrid linear modeling that minimizes an l1 error to cope with outliers, needs negligible storage, and comes with an empirical assessment of its performance.
Findings
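Models data as a mixture of K d-dimensional linear subspaces at a cost of order O(n K d D + n d^2 D), where n is the number of iterations.
Handles outliers by minimizing the cumulative l1 (median-type) error rather than a squared error.
Operates online: data can be supplied incrementally, output is produced incrementally, and storage requirements are negligible.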
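To illustrate why a median-type (l1) criterion is less sensitive to outliers than a squared-error one, here is a minimal one-dimensional sketch. It is not taken from the paper: the data values and the helper names `l1_cost` and `l2_cost` are made up for illustration. It relies only on the standard fact that the mean minimizes the sum of squared deviations, while the median minimizes the sum of absolute deviations.

```python
import numpy as np

# Hypothetical data: four points near 1.0 and one gross outlier.
values = np.array([1.0, 1.1, 0.9, 1.0, 50.0])


def l1_cost(center, data):
    """Sum of absolute deviations (the median minimizes this)."""
    return np.abs(data - center).sum()


def l2_cost(center, data):
    """Sum of squared deviations (the mean minimizes this)."""
    return ((data - center) ** 2).sum()


l2_center = values.mean()      # dragged toward the outlier
l1_center = np.median(values)  # stays with the bulk of the data

print("l2 (mean) center:  ", l2_center)
print("l1 (median) center:", l1_center)
```

The single outlier moves the mean far from the bulk of the data, while the median stays at 1.0. Median K-Flats applies the same idea to distances from points to subspaces instead of distances to a single center.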
Abstract
We describe the Median K-Flats (MKF) algorithm, a simple online method for hybrid linear modeling, i.e., for approximating data by a mixture of flats. This algorithm simultaneously partitions the data into clusters while finding their corresponding best approximating l1 d-flats, so that the cumulative l1 error is minimized. The current implementation restricts d-flats to be d-dimensional linear subspaces. It requires a negligible amount of storage, and its complexity, when modeling data consisting of N points in D-dimensional Euclidean space with K d-dimensional linear subspaces, is of order O(n K d D + n d^2 D), where n is the number of iterations required for convergence (empirically on the order of 10^4). Since it is an online algorithm, data can be supplied to it incrementally and it can incrementally produce the corresponding output. The performance of the algorithm is carefully…
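The kind of online update the abstract describes can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: it assumes each subspace is stored as a d x D matrix with orthonormal rows, that each iteration draws one point, assigns it to the nearest subspace, and takes a gradient step on that point's (unsquared) distance, and that rows are re-orthonormalized with a QR factorization. The function names, the fixed step size, and the default iteration count are all hypothetical.

```python
import numpy as np


def orthonormalize_rows(P):
    """Return a matrix with orthonormal rows spanning the row space of P."""
    Q, _ = np.linalg.qr(P.T)  # reduced QR of a D x d matrix: O(d^2 D)
    return Q.T


def residual(P, x):
    """Component of x orthogonal to the subspace spanned by the rows of P."""
    return x - P.T @ (P @ x)


def mkf_online(X, K, d, n_iter=5000, step=0.01, seed=0):
    """Sketch of an online l1 K-subspaces fit (assumed update rule)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    flats = [orthonormalize_rows(rng.standard_normal((d, D)))
             for _ in range(K)]
    for _ in range(n_iter):
        x = X[rng.integers(N)]                 # one point per iteration
        res = [residual(P, x) for P in flats]  # O(K d D)
        dists = [np.linalg.norm(r) for r in res]
        k = int(np.argmin(dists))              # nearest subspace
        if dists[k] > 1e-12:                   # distance is not smooth at 0
            P = flats[k]
            # Gradient of ||x - P^T P x|| with respect to P, using the
            # fact that P has orthonormal rows (so P @ res[k] vanishes).
            grad = -np.outer(P @ x, res[k]) / dists[k]
            flats[k] = orthonormalize_rows(P - step * grad)
    return flats


def assign(X, flats):
    """Label each row of X with the index of its nearest subspace."""
    dists = np.stack(
        [np.linalg.norm(X - (X @ P.T) @ P, axis=1) for P in flats], axis=1)
    return dists.argmin(axis=1)
```

In this sketch each iteration costs O(K d D) for the K distance computations plus O(d^2 D) for the re-orthonormalization, which matches the order O(n K d D + n d^2 D) quoted in the abstract. Only the K subspace matrices are kept between iterations, so points could equally be read from a stream instead of from an in-memory array. A call such as `flats = mkf_online(X, K=2, d=2)` followed by `labels = assign(X, flats)` returns the fitted bases and a cluster label for each point.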
