Aggregation of Dependent Expert Distributions in Multimodal Variational Autoencoders

Rogelio A Mancisidor; Robert Jenssen; Shujian Yu; Michael Kampffmeyer

arXiv:2505.01134·cs.LG·May 5, 2025

Open Access

TL;DR

This paper introduces CoDE-VAE, a novel multimodal VAE approach that effectively aggregates dependent expert distributions, improving joint likelihood estimation, generative coherence, and classification accuracy over existing methods.

Contribution

It proposes a new aggregation method for dependent experts in multimodal VAEs that avoids the independence assumption made by product-of-experts and mixture-of-experts aggregation.

Findings

01

CoDE-VAE outperforms existing methods in log-likelihood estimation.

02

It maintains high generative quality as the number of modalities increases.

03

It achieves classification accuracy comparable to state-of-the-art models.

Abstract

Multimodal learning with variational autoencoders (VAEs) requires estimating joint distributions to evaluate the evidence lower bound (ELBO). Current methods, the product and mixture of experts, aggregate single-modality distributions assuming independence for simplicity, which is an overoptimistic assumption. This research introduces a novel methodology for aggregating single-modality distributions by exploiting the principle of consensus of dependent experts (CoDE), which circumvents the aforementioned assumption. Utilizing the CoDE method, we propose a novel ELBO that approximates the joint likelihood of the multimodal data by learning the contribution of each subset of modalities. The resulting CoDE-VAE model demonstrates better performance in terms of balancing the trade-off between generative coherence and generative quality, as well as generating more precise log-likelihood…
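For readers unfamiliar with the two baseline aggregation rules the abstract names, the following is a minimal sketch of product-of-experts and mixture-of-experts aggregation. It is not the CoDE method. It assumes each single-modality expert is a diagonal Gaussian, omits the prior expert that some product-of-experts models include, and uses hypothetical function names (`product_of_experts`, `mixture_of_experts_moments`) chosen for illustration only.

```python
import numpy as np


def product_of_experts(mus, variances):
    """Product of independent Gaussian experts (assumed diagonal).

    The product of Gaussians is Gaussian with summed precisions and a
    precision-weighted mean.
    """
    mus = np.asarray(mus, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    var = 1.0 / precisions.sum(axis=0)
    mu = var * (precisions * mus).sum(axis=0)
    return mu, var


def mixture_of_experts_moments(mus, variances, weights=None):
    """Mean and variance of a mixture of Gaussian experts.

    Uniform weights are used when none are given. The mixture itself is
    not Gaussian; only its first two moments are returned.
    """
    mus = np.asarray(mus, dtype=float)
    variances = np.asarray(variances, dtype=float)
    k = mus.shape[0]
    w = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, dtype=float)
    w = w.reshape((k,) + (1,) * (mus.ndim - 1))  # broadcast over latent dims
    mu = (w * mus).sum(axis=0)
    var = (w * (variances + mus**2)).sum(axis=0) - mu**2
    return mu, var


# Two hypothetical experts with unit variance and means 0 and 2.
poe_mu, poe_var = product_of_experts([[0.0], [2.0]], [[1.0], [1.0]])
moe_mu, moe_var = mixture_of_experts_moments([[0.0], [2.0]], [[1.0], [1.0]])
```

In this toy case both rules agree on the mean, but the product yields a sharper distribution (variance 0.5) while the mixture yields a broader one (variance 2.0). Both rules combine the experts without modeling any dependence between them, which is the assumption the paper sets out to relax.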

Taxonomy

Topics: Scientific Research Methodologies and Applications · Computational and Text Analysis Methods · Neural Networks and Applications