Training Multimodal Systems for Classification with Multiple Objectives

Jason Armitage, Shramana Thakur, Rishi Tripathi, Jens Lehmann, and Maria Maleshkova

arXiv:2008.11450·cs.LG·October 27, 2020

TL;DR

This paper proposes a novel multimodal learning framework using multiple objectives and variational inference to improve generalization and stability in classification tasks involving text and images.

Contribution

It introduces a multi-objective training approach with probabilistic regularization for multimodal neural networks, enhancing performance and robustness.

Findings

01 Reduced variance in training through regularization

02 Improved generalization on multimodal classification tasks

03 Stabilized performance with added neurons in layers

Abstract

We learn about the world from a diverse range of sensory information. Automated systems lack this ability as investigation has centred on processing information presented in a single form. Adapting architectures to learn from multiple modalities creates the potential to learn rich representations of the world - but current multimodal systems only deliver marginal improvements on unimodal approaches. Neural networks learn sampling noise during training with the result that performance on unseen data is degraded. This research introduces a second objective over the multimodal fusion process learned with variational inference. Regularisation methods are implemented in the inner training loop to control variance and the modular structure stabilises performance as additional neurons are added to layers. This framework is evaluated on a multilabel classification task with textual and visual…
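
As a rough illustration of how a classification objective can be paired with a second, variationally learned objective, the sketch below adds a KL-divergence term over a latent fusion variable to a multilabel cross-entropy loss. It is not the authors' implementation: the diagonal-Gaussian posterior, the standard-normal prior, the weighting factor `beta`, and all function names are assumptions made for this example only.

```python
import math


def sigmoid(x):
    """Logistic function; adequate for the moderate logits used here."""
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_bce(logits, targets):
    """Mean binary cross-entropy, treating each label independently."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for logit, y in zip(logits, targets):
        p = sigmoid(logit)
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(logits)


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def combined_loss(logits, targets, mu, log_var, beta=0.1):
    """Classification loss plus a weighted KL term (beta is hypothetical)."""
    return multilabel_bce(logits, targets) + beta * kl_to_standard_normal(
        mu, log_var
    )


if __name__ == "__main__":
    # Two labels, one latent dimension for the fused representation.
    loss = combined_loss(
        logits=[2.0, -1.0], targets=[1, 0], mu=[0.5], log_var=[-0.2]
    )
    print(f"combined loss: {loss:.4f}")
```

In this sketch the KL term acts as a regulariser on the fused representation, and `beta` trades it off against classification accuracy. How the paper itself weights or schedules its objectives is not stated in the text above.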

Taxonomy

Topics: Advanced Text Analysis Techniques · Neural Networks and Applications · Topic Modeling