Multimodal Factor Analysis
Yasin Yilmaz, Alfred O. Hero

TL;DR
This paper introduces a multimodal factor analysis model that integrates diverse data types through shared latent factors, enabling effective dimensionality reduction and clustering, demonstrated on Twitter hashtag data.
Contribution
The paper proposes a novel generative graphical model for multimodal data, provides an EM algorithm for parameter estimation, and extends the model to the von Mises-Fisher distribution for spherical data.
Findings
Successfully localizes hashtags across multiple modalities
Achieves effective dimensionality reduction and clustering
Extends model to spherical coordinate data
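To make the spherical extension concrete, here is a minimal sketch of the standard von Mises-Fisher log-density on the unit sphere in three dimensions, applied to latitude/longitude pairs. It is not taken from the paper: the function names `latlon_to_unit` and `vmf_logpdf` are hypothetical, and how the authors embed this distribution in their model is not described in this summary.

```python
import numpy as np

def latlon_to_unit(lat_deg, lon_deg):
    """Map latitude/longitude in degrees to unit vectors on the sphere."""
    lat = np.radians(lat_deg)
    lon = np.radians(lon_deg)
    return np.stack(
        [np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)],
        axis=-1,
    )

def vmf_logpdf(x, mu, kappa):
    """Log-density of the von Mises-Fisher distribution on the 2-sphere.

    f(x) = kappa / (4 * pi * sinh(kappa)) * exp(kappa * mu^T x), kappa > 0.
    x: (..., 3) unit vectors, mu: (3,) unit mean direction.
    """
    # log sinh(kappa) written in a form that does not overflow for large kappa
    log_sinh = kappa + np.log1p(-np.exp(-2.0 * kappa)) - np.log(2.0)
    log_norm = np.log(kappa) - np.log(4.0 * np.pi) - log_sinh
    return log_norm + kappa * (x @ mu)

# Illustrative coordinates only (not from the paper's dataset).
points = latlon_to_unit(np.array([40.0, -33.9]), np.array([-74.0, 151.2]))
center = latlon_to_unit(40.0, -74.0)
log_density = vmf_logpdf(points, center, kappa=50.0)
```

A larger concentration `kappa` puts more mass near the mean direction `center`; as `kappa` approaches zero the density approaches the uniform value 1/(4π) on the sphere.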
Abstract
A multimodal system with Poisson, Gaussian, and multinomial observations is considered. A generative graphical model that combines multiple modalities through common factor loadings is proposed. In this model, latent factors are like summary objects that have latent factor scores in each modality, and the observed objects are represented in terms of such summary objects. This potentially brings about a significant dimensionality reduction. It also naturally enables a powerful means of clustering based on a diverse set of observations. An expectation-maximization (EM) algorithm to find the model parameters is provided. The algorithm is tested on a Twitter dataset which consists of the counts and geographical coordinates of hashtag occurrences, together with the bag of words for each hashtag. The resultant factors successfully localize the hashtags in all dimensions: counts, coordinates,…
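The generative idea in the abstract can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's exact model: it assumes each observed object has nonnegative loadings that sum to one and are shared across modalities, and that each latent factor carries a Poisson rate, a 2-D Gaussian mean, and a word distribution. The function name `sample_multimodal`, all parameter names, and the priors used to draw the factor scores are hypothetical.

```python
import numpy as np

def sample_multimodal(n_objects=5, n_factors=3, vocab_size=8, n_words=20, seed=0):
    """Draw synthetic Poisson, Gaussian, and multinomial observations
    that share one set of factor loadings per object (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Per-factor scores in each modality (assumed priors, chosen for illustration).
    rates = rng.gamma(shape=2.0, scale=5.0, size=n_factors)           # Poisson rates
    means = rng.normal(0.0, 10.0, size=(n_factors, 2))                # 2-D Gaussian means
    word_probs = rng.dirichlet(np.ones(vocab_size), size=n_factors)   # word distributions

    # Shared loadings: each object is a convex combination of the factors (assumption).
    loadings = rng.dirichlet(np.ones(n_factors), size=n_objects)

    # Each modality is generated from the same loadings.
    counts = rng.poisson(loadings @ rates)
    coords = rng.normal(loadings @ means, 1.0)
    mixed = loadings @ word_probs
    mixed = mixed / mixed.sum(axis=1, keepdims=True)  # guard against rounding drift
    words = np.stack([rng.multinomial(n_words, p) for p in mixed])
    return loadings, counts, coords, words

loadings, counts, coords, words = sample_multimodal()
```

Because every modality is driven by the same `loadings` matrix, an object is summarized by `n_factors` numbers rather than by its full count, coordinate, and bag-of-words vector, which is the dimensionality reduction the abstract refers to. Grouping objects by their dominant loading gives one simple way to cluster across modalities.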
