MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences

Wei Han; Hui Chen; Min-Yen Kan; Soujanya Poria

arXiv:2210.12798 · cs.CL · October 25, 2022 · 1 citation

TL;DR

MM-Align introduces a novel optimal transport-based method for fast, accurate inference in multimodal tasks with missing modalities, improving imputation and reducing overfitting.

Contribution

It proposes a new alignment dynamics learning approach using optimal transport and a denoising training algorithm for missing modality inference.

Findings

01. Outperforms previous methods in accuracy and speed

02. Effective in various missing modality scenarios

03. Reduces overfitting in multimodal inference

Abstract

Existing multimodal tasks mostly target the complete input modality setting, i.e., each modality is either complete or completely missing in both training and test sets. However, the randomly missing situation remains underexplored. In this paper, we present a novel approach named MM-Align to address the missing-modality inference problem. Concretely, we propose 1) an alignment dynamics learning module based on the theory of optimal transport (OT) for indirect missing data imputation; 2) a denoising training algorithm to simultaneously enhance the imputation results and backbone network performance. Compared with previous methods that are devoted to reconstructing the missing inputs, MM-Align learns to capture and imitate the alignment dynamics between modality sequences. Results of comprehensive experiments on three datasets covering two multimodal tasks empirically demonstrate…
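The abstract names optimal transport as the basis of the alignment module, but this page does not give the formulation. As general background only, the following is a minimal sketch of entropic-regularized OT (Sinkhorn iterations) between two short feature sequences, followed by a barycentric mapping of one sequence onto the other. It is not MM-Align's implementation: the function names, the squared-Euclidean cost, the uniform marginals, the value of `eps`, and the toy sequences are all assumptions made for illustration.

```python
import math

def sq_dist(x, y):
    # Squared Euclidean distance between two feature vectors (assumed cost).
    return sum((a - b) ** 2 for a, b in zip(x, y))

def sinkhorn_plan(src, tgt, eps=0.5, n_iters=200):
    """Entropic OT plan between two sequences, assuming uniform marginals."""
    n, m = len(src), len(tgt)
    a = [1.0 / n] * n  # mass on each source time step
    b = [1.0 / m] * m  # mass on each target time step
    # Gibbs kernel K = exp(-cost / eps)
    K = [[math.exp(-sq_dist(x, y) / eps) for y in tgt] for x in src]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def barycentric_map(plan, tgt):
    """Send each source step to the plan-weighted average of target features."""
    dim = len(tgt[0])
    mapped = []
    for row in plan:
        mass = sum(row)
        mapped.append(
            [sum(row[j] * tgt[j][d] for j in range(len(tgt))) / mass
             for d in range(dim)]
        )
    return mapped

# Hypothetical toy sequences: 3 steps of one modality, 2 steps of another.
src = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
tgt = [[0.1, 0.0], [2.1, 2.0]]

plan = sinkhorn_plan(src, tgt)
mapped = barycentric_map(plan, tgt)
for row in plan:
    print([round(p, 4) for p in row])
```

Each row of `plan` says how one step of the first sequence distributes its mass over the steps of the second, which is one simple way to read an alignment between sequences of different lengths. A learned model would typically replace the fixed cost with one computed from learned representations; that choice is outside what this page states.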

Code & Models

Repositories

declare-lab/mm-align (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis · Multimodal Machine Learning Applications

Methods: Test