# MFAS: Multimodal Fusion Architecture Search

**Authors:** Juan-Manuel Pérez-Rúa, Valentin Vielzeuf, Stéphane Pateux, Moez Baccouche, Frédéric Jurie

arXiv: 1903.06496 · 2019-03-18

## TL;DR

This paper introduces MFAS, which poses multimodal fusion for classification as a neural architecture search problem. Evaluated on a toy dataset and two real multimodal datasets, it discovers fusion architectures with state-of-the-art performance, including on NTU RGB+D.

## Contribution

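The paper presents a novel, generic search space that spans a large number of possible fusion architectures, together with an efficient sequential model-based exploration method tailored to the problem, so that an effective fusion architecture for a given dataset can be discovered automatically.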
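To make the idea of searching over fusion architectures concrete, here is a minimal sketch of a generic sequential model-based search loop, the family of methods the abstract names. It is not the authors' algorithm. The toy search space (one feature index per modality plus an activation per fusion layer), the synthetic `evaluate` function, and the averaging `Surrogate` are all assumptions made for illustration; in a real setting `evaluate` would train and validate a fusion network.

```python
import itertools
import random

# Hypothetical toy search space: each fusion layer picks one hidden layer from
# each of two modality networks plus an activation. These sizes and names are
# illustrative assumptions, not values from the paper.
N_LAYERS_A = 3
N_LAYERS_B = 3
ACTIVATIONS = ("relu", "sigmoid")


def search_space(n_fusion_layers):
    """Enumerate every architecture with the given number of fusion layers."""
    choices = list(itertools.product(range(N_LAYERS_A), range(N_LAYERS_B), ACTIVATIONS))
    return list(itertools.product(choices, repeat=n_fusion_layers))


def evaluate(arch):
    """Stand-in for the expensive step of training and validating a network.

    The score is synthetic: it favours deeper features and ReLU.
    """
    total = sum(a + b + (0.5 if act == "relu" else 0.0) for a, b, act in arch)
    return total / len(arch)


class Surrogate:
    """Tiny surrogate: averages the observed scores of each per-layer choice."""

    def __init__(self):
        self.totals = {}
        self.counts = {}

    def update(self, arch, score):
        for choice in arch:
            self.totals[choice] = self.totals.get(choice, 0.0) + score
            self.counts[choice] = self.counts.get(choice, 0) + 1

    def predict(self, arch):
        known = [self.totals[c] / self.counts[c] for c in arch if c in self.counts]
        # Architectures made only of unseen choices get a pessimistic 0.0.
        return sum(known) / len(known) if known else 0.0


def smbo_search(n_fusion_layers=2, n_rounds=5, batch_size=4, seed=0):
    """Alternate between evaluating a batch and refitting the surrogate."""
    rng = random.Random(seed)
    candidates = search_space(n_fusion_layers)
    surrogate = Surrogate()
    history = {}
    batch = rng.sample(candidates, batch_size)  # random batch seeds the surrogate
    for _ in range(n_rounds):
        for arch in batch:
            if arch not in history:
                history[arch] = evaluate(arch)
                surrogate.update(arch, history[arch])
        unseen = [c for c in candidates if c not in history]
        if not unseen:
            break
        # Let the surrogate rank unevaluated candidates; evaluate the top ones next.
        unseen.sort(key=surrogate.predict, reverse=True)
        batch = unseen[:batch_size]
    best = max(history, key=history.get)
    return best, history[best], len(history)


best_arch, best_score, n_evaluated = smbo_search()
```

With these defaults the loop evaluates 20 of the 324 candidate architectures. The point of the surrogate is that only a small fraction of the space needs the expensive evaluation.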

## Key findings

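- The discovered fusion architectures reach state-of-the-art performance on problems that differ in domain and dataset size.
- Results include NTU RGB+D, which the authors describe as the largest multimodal action recognition dataset available.
- Experiments on a toy dataset and two real multimodal datasets support posing multimodal fusion as a neural architecture search problem.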

## Abstract

We tackle the problem of finding good architectures for multimodal classification problems. We propose a novel and generic search space that spans a large number of possible fusion architectures. In order to find an optimal architecture for a given dataset in the proposed search space, we leverage an efficient sequential model-based exploration approach that is tailored for the problem. We demonstrate the value of posing multimodal fusion as a neural architecture search problem by extensive experimentation on a toy dataset and two other real multimodal datasets. We discover fusion architectures that exhibit state-of-the-art performance for problems with different domain and dataset size, including the NTU RGB+D dataset, the largest multi-modal action recognition dataset available.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.06496/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1903.06496/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/1903.06496/full.md

---
Source: https://tomesphere.com/paper/1903.06496