Multimodal data integration and cross-modal querying via orchestrated approximate message passing
Sagnik Nandy, Zongming Ma

TL;DR
This paper introduces a data-driven approximate message passing algorithm for integrating multimodal data and constructing valid prediction sets for partially observed new subjects, demonstrated on synthetic and real datasets.
Contribution
It develops a novel orchestrated approximate message passing method for statistically optimal multimodal data integration and prediction in dependent multifactor models.
Findings
Achieves statistically optimal signal recovery in multimodal data integration.
Constructs asymptotically valid prediction sets for new subjects.
Demonstrates effectiveness on synthetic and real single-cell datasets.
Abstract
The need for multimodal data integration arises naturally when multiple complementary sets of features are measured on the same sample. Under a dependent multifactor model, we develop a fully data-driven orchestrated approximate message passing algorithm for integrating information across these feature sets to achieve statistically optimal signal recovery. In practice, these reference data sets are often queried later by new subjects that are only partially observed. Leveraging the asymptotic normality of the estimates generated by our data integration method, we further develop an asymptotically valid prediction set for the latent representation of any such query subject. We demonstrate the prowess of both the data integration and the prediction set construction algorithms on synthetic examples and real-world single-cell datasets.
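To make the last idea concrete, here is a minimal sketch of how asymptotic normality can be turned into a prediction set. It is a generic Gaussian ellipsoid, not the paper's construction: it assumes a query subject's latent representation is approximately normal around an estimated center with a known covariance, and it thresholds the squared Mahalanobis distance at a chi-square quantile. The function names `chi2_quantile_mc` and `in_prediction_set`, the example center and covariance, and the Monte Carlo quantile (used only to avoid a SciPy dependency) are all assumptions made for illustration.

```python
import numpy as np


def chi2_quantile_mc(df, level, n_draws=200_000, seed=0):
    """Approximate the `level` quantile of a chi-square(df) law by simulation.

    Hypothetical helper: sums of `df` squared standard normals are
    chi-square(df), so their empirical quantile approximates the true one.
    """
    rng = np.random.default_rng(seed)
    draws = (rng.standard_normal((n_draws, df)) ** 2).sum(axis=1)
    return float(np.quantile(draws, level))


def in_prediction_set(x, center, cov, threshold):
    """Return True if x lies in the ellipsoid
    {x : (x - center)^T cov^{-1} (x - center) <= threshold}."""
    diff = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    d2 = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
    return bool(d2 <= threshold)


if __name__ == "__main__":
    # Illustrative (made-up) center and covariance for a 2-dimensional latent.
    center = np.array([1.0, -0.5])
    cov = np.array([[1.0, 0.3], [0.3, 0.5]])
    threshold = chi2_quantile_mc(df=2, level=0.95)

    # Empirical coverage check on simulated latents drawn from the same law.
    rng = np.random.default_rng(1)
    samples = rng.multivariate_normal(center, cov, size=5000)
    coverage = np.mean(
        [in_prediction_set(s, center, cov, threshold) for s in samples]
    )
    print(f"threshold = {threshold:.3f}, empirical coverage = {coverage:.3f}")
```

If the normal approximation holds, the ellipsoid should contain the true latent with probability close to the chosen level. In the setting of the abstract, the center and covariance would come from the data integration step rather than being fixed by hand.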
Taxonomy
Topics: Data Management and Algorithms · Semantic Web and Ontologies · Web Data Mining and Analysis
