Understanding Unimodal Bias in Multimodal Deep Linear Networks

Yedi Zhang; Peter E. Latham; Andrew Saxe

arXiv:2312.00935 · cs.LG · July 30, 2024 · 1 citation

TL;DR

This paper develops a theory of unimodal bias in multimodal deep linear networks. It shows how the depth at which modalities are fused, the dataset statistics, and the initialization determine how long the network relies on a single modality during training, and how a long unimodal phase can hurt generalization.

Contribution

The first analytical calculation of the duration of the unimodal phase in multimodal deep linear networks, as a function of the depth at which modalities are fused, the dataset statistics, and the initialization.

Findings

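01. The deeper the layer at which modalities are fused, the longer the unimodal phase.

02. A long unimodal phase can lead to a generalization deficit and permanent unimodal bias in the overparametrized regime.

03. The results, derived for multimodal linear networks, extend to nonlinear networks in certain settings.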

Abstract

Using multiple input streams simultaneously to train multimodal neural networks is intuitively advantageous but practically challenging. A key challenge is unimodal bias, where a network overly relies on one modality and ignores others during joint training. We develop a theory of unimodal bias with multimodal deep linear networks to understand how architecture and data statistics influence this bias. This is the first work to calculate the duration of the unimodal phase in learning as a function of the depth at which modalities are fused within the network, dataset statistics, and initialization. We show that the deeper the layer at which fusion occurs, the longer the unimodal phase. A long unimodal phase can lead to a generalization deficit and permanent unimodal bias in the overparametrized regime. Our results, derived for multimodal linear networks, extend to nonlinear networks in…
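
The effect of fusion depth described above can be illustrated with a toy simulation. The following is a minimal sketch, not the authors' code: it assumes two scalar, uncorrelated, zero-mean modalities with variances `var_a` and `var_b`, a target y = x_A + x_B, a two-layer linear network trained by gradient descent on the population squared loss, and a small initialization `eps`. The function name `train` and all parameter values are hypothetical. "Early" fusion shares one second-layer weight across both modalities; "late" fusion gives each modality its own second-layer weight.

```python
def train(fusion, var_a=4.0, var_b=1.0, eps=0.01, lr=0.01, steps=2000):
    """Gradient descent on a toy two-layer multimodal linear network.

    Returns (t_a, t_b, w_a, w_b): the first steps at which each modality's
    total weight reaches 0.5, and the final total weights (the optimum is 1).
    """
    p, q = eps, eps      # first-layer weights for modalities A and B
    ca, cb = eps, eps    # second-layer weights (kept equal under early fusion)
    t_a = t_b = None
    for step in range(steps):
        w_a, w_b = ca * p, cb * q  # total input-output weights
        if t_a is None and w_a >= 0.5:
            t_a = step
        if t_b is None and w_b >= 0.5:
            t_b = step
        # Negative gradient of the population loss
        # 0.5 * (var_a * (1 - w_a)**2 + var_b * (1 - w_b)**2)
        # with respect to the total weights.
        g_a = var_a * (1.0 - w_a)
        g_b = var_b * (1.0 - w_b)
        # Chain rule through the first layer.
        dp, dq = ca * g_a, cb * g_b
        if fusion == "early":
            # One shared second-layer weight sees both modalities.
            dca = dcb = p * g_a + q * g_b
        else:
            # Late fusion: each branch has its own second-layer weight.
            dca, dcb = p * g_a, q * g_b
        p += lr * dp
        q += lr * dq
        ca += lr * dca
        cb += lr * dcb
    return t_a, t_b, ca * p, cb * q


if __name__ == "__main__":
    for fusion in ("early", "late"):
        t_a, t_b, w_a, w_b = train(fusion)
        print(fusion, "half-learning steps:", t_a, t_b, "gap:", t_b - t_a)
```

Under these assumptions, the gap between the two half-learning times serves as a crude proxy for the length of the unimodal phase. In the late-fusion run, modality B has to escape its own small-weight plateau at a rate set only by its own variance, so the gap should be noticeably wider than in the early-fusion run, where the shared weight has already grown by the time modality A is learned.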

Code & Models

Repositories

yedizhang/unimodal-bias (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications