Large-scale Data Integration using Matrix Denoising and Geometric Factor Matching
Felix Held

TL;DR
This paper presents lsCMF, a fast and scalable method for large-scale data integration that employs matrix denoising and geometric factor matching, suitable for flexible data layouts.
Contribution
The paper introduces a novel approach to matrix integration as a graph estimation problem, enabling rapid and flexible data integration with high structural accuracy.
Findings
lsCMF achieves high estimation speed in simulations.
The method effectively integrates data with flexible layouts.
It maintains good data structure estimation accuracy.
Abstract
Unsupervised integrative analysis of multiple data sources has become commonplace, and scalable algorithms are necessary to accommodate the ever-increasing availability of data. Only a few current methods have estimation speed as their focus, and those that do are only applicable to restricted data layouts such as different data types measured on the same observation units. We introduce a novel point of view on low-rank matrix integration, phrased as a graph estimation problem, which allows development of a method, large-scale Collective Matrix Factorization (lsCMF), which is able to integrate data in flexible layouts in a speedy fashion. It utilizes a matrix denoising framework for rank estimation and geometric properties of singular vectors to efficiently integrate data. The quick estimation speed of lsCMF while retaining good estimation of data structure is then demonstrated in simulation…
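To make the two ingredients named in the abstract more concrete, here is a minimal Python sketch. It is not the lsCMF algorithm and does not reproduce the paper's procedure. It assumes i.i.d. Gaussian noise with known standard deviation `sigma`, estimates rank by counting singular values above the asymptotic noise edge `sigma * (sqrt(n) + sqrt(p))`, and matches factors across two matrices that share rows by the absolute cosine between their left singular vectors. The function names, the safety `margin`, the `min_cos` cutoff, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def estimate_rank(Y, sigma, margin=1.1):
    """Keep singular values above the noise edge sigma * (sqrt(n) + sqrt(p)).

    `margin` is an assumed safety factor guarding against finite-sample
    fluctuations of the largest noise singular value.
    """
    n, p = Y.shape
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    threshold = margin * sigma * (np.sqrt(n) + np.sqrt(p))
    r = int(np.sum(s > threshold))
    return r, U[:, :r], s[:r], Vt[:r].T


def match_factors(A, B, min_cos=0.5):
    """Pair columns of A and B whose absolute cosine exceeds min_cos.

    Columns are unit-norm singular vectors, so A.T @ B holds the cosines.
    The cutoff `min_cos` is an assumption made for this illustration.
    """
    C = np.abs(A.T @ B)
    return [(int(i), int(j)) for i, j in np.argwhere(C > min_cos)]


# Simulated layout (hypothetical): two data matrices measured on the same 200 rows.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((200, 2)))
u, w = Q[:, 0], Q[:, 1]          # u is shared by both matrices, w is specific to X1
V, _ = np.linalg.qr(rng.standard_normal((150, 2)))
z = rng.standard_normal(100)
z /= np.linalg.norm(z)

X1 = 80 * np.outer(u, V[:, 0]) + 60 * np.outer(w, V[:, 1]) + rng.standard_normal((200, 150))
X2 = 70 * np.outer(u, z) + rng.standard_normal((200, 100))

r1, U1, s1, V1 = estimate_rank(X1, sigma=1.0)
r2, U2, s2, V2 = estimate_rank(X2, sigma=1.0)
pairs = match_factors(U1, U2)    # (factor index in X1, factor index in X2)
print(r1, r2, pairs)
```

In this toy setting the shared direction `u` is the only factor linked across the two matrices, while `w` stays specific to `X1`. Treating matched factors as edges between matrices is one way to read the graph estimation view mentioned in the abstract, though the paper's actual construction may differ.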
Taxonomy
Topics: Face and Expression Recognition · Neural Networks and Applications
