Data Fusion by Matrix Factorization
Marinka Žitnik, Blaž Zupan

TL;DR
This paper introduces a penalized matrix tri-factorization method for data fusion that integrates multiple heterogeneous data sources to improve prediction accuracy in biological tasks.
Contribution
The paper presents a novel matrix factorization approach for data fusion that can handle diverse data types and demonstrates its effectiveness in gene function and pharmacologic action prediction.
Findings
DFMF outperforms alternative data integration methods.
Fusing multiple data sources achieves higher accuracy than using any single data source.
The approach is effective on both the gene function and the pharmacologic action prediction tasks.
Abstract
For most problems in science and engineering we can obtain data sets that describe the observed system from various perspectives and record the behavior of its individual components. Heterogeneous data sets can be collectively mined by data fusion. Fusion can focus on a specific target relation and exploit directly associated data together with contextual data and data about the system's constraints. In the paper we describe a data fusion approach with penalized matrix tri-factorization (DFMF) that simultaneously factorizes data matrices to reveal hidden associations. The approach can directly consider any data that can be expressed in a matrix, including those from feature-based representations, ontologies, associations and networks. We demonstrate the utility of DFMF for a gene function prediction task with eleven different data sources and for prediction of pharmacologic actions by fusing…
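To make "simultaneously factorizes data matrices" concrete, the sketch below jointly tri-factorizes two nonnegative relation matrices that share one object type, so that R12 ≈ G1 S12 G2ᵀ and R13 ≈ G1 S13 G3ᵀ with a common factor G1. It is a minimal illustration under stated assumptions, not the paper's algorithm: it uses standard multiplicative updates for nonnegative factorization, it omits the penalty (constraint) terms that DFMF adds, and the function name `fuse_two_relations`, the ranks, and the iteration count are all hypothetical choices made for this example.

```python
import numpy as np


def fuse_two_relations(R12, R13, k1, k2, k3, n_iter=200, seed=0, eps=1e-9):
    """Jointly factorize R12 ~ G1 S12 G2^T and R13 ~ G1 S13 G3^T.

    Assumption: R12 and R13 are nonnegative and share their row objects.
    The penalty terms of penalized tri-factorization are NOT included.
    """
    rng = np.random.default_rng(seed)
    n1, n2 = R12.shape
    n3 = R13.shape[1]
    # Random nonnegative initialization of all factors.
    G1 = rng.random((n1, k1))
    G2 = rng.random((n2, k2))
    G3 = rng.random((n3, k3))
    S12 = rng.random((k1, k2))
    S13 = rng.random((k1, k3))

    def error():
        # Sum of squared Frobenius reconstruction errors over both relations.
        e12 = np.linalg.norm(R12 - G1 @ S12 @ G2.T) ** 2
        e13 = np.linalg.norm(R13 - G1 @ S13 @ G3.T) ** 2
        return e12 + e13

    errors = [error()]
    for _ in range(n_iter):
        # The shared factor G1 receives contributions from both relations.
        num = R12 @ G2 @ S12.T + R13 @ G3 @ S13.T
        den = G1 @ (S12 @ G2.T @ G2 @ S12.T + S13 @ G3.T @ G3 @ S13.T) + eps
        G1 *= num / den
        # Factors that appear in only one relation.
        G2 *= (R12.T @ G1 @ S12) / (G2 @ (S12.T @ G1.T @ G1 @ S12) + eps)
        G3 *= (R13.T @ G1 @ S13) / (G3 @ (S13.T @ G1.T @ G1 @ S13) + eps)
        # Middle matrices, one per relation.
        S12 *= (G1.T @ R12 @ G2) / (G1.T @ G1 @ S12 @ G2.T @ G2 + eps)
        S13 *= (G1.T @ R13 @ G3) / (G1.T @ G1 @ S13 @ G3.T @ G3 + eps)
        errors.append(error())
    return G1, G2, G3, S12, S13, errors


if __name__ == "__main__":
    # Synthetic low-rank nonnegative data (illustrative only, not from the paper).
    rng = np.random.default_rng(1)
    A, B, C = rng.random((30, 3)), rng.random((20, 4)), rng.random((25, 2))
    R12 = A @ rng.random((3, 4)) @ B.T
    R13 = A @ rng.random((3, 2)) @ C.T
    *_, errors = fuse_two_relations(R12, R13, k1=3, k2=4, k3=2)
    print(f"error before: {errors[0]:.3f}, after: {errors[-1]:.3f}")
```

Because G1 is updated from both relations at once, information in R13 shapes the latent representation used to reconstruct R12, which is the basic mechanism by which fusion can help a target relation. In this simplified setting, an estimate of the target relation would be read off as `G1 @ S12 @ G2.T`.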
