Foundations of Multisensory Artificial Intelligence

Paul Pu Liang

arXiv:2404.18976 · cs.LG · May 1, 2024 · 1 citation

TL;DR

This thesis develops theoretical frameworks and practical models for multisensory AI, which learns from multiple modalities at once, with applications in healthcare, robotics, and multimedia processing.

Contribution

It introduces a formal framework for modality interactions, a large-scale multisensory benchmark, and multimodal architectures that advance the development of general-purpose multisensory AI systems.

Findings

01. Quantification of modality interactions aids dataset understanding.

02. The MultiBench benchmark enables comprehensive evaluation across modalities.

03. Multimodal transformers facilitate scalable multisensory AI applications.

Abstract

Building multisensory AI systems that learn from multiple sensory inputs such as text, speech, video, real-world sensors, wearable devices, and medical data holds great promise for impact in many scientific areas with practical benefits, such as in supporting human health and well-being, enabling multimedia content processing, and enhancing real-world autonomous agents. By synthesizing a range of theoretical frameworks and application domains, this thesis aims to advance the machine learning foundations of multisensory AI. In the first part, we present a theoretical framework formalizing how modalities interact with each other to give rise to new information for a task. These interactions are the basic building blocks in all multimodal problems, and their quantification enables users to understand their multimodal datasets, design principled approaches to learn these interactions, and…

Taxonomy

Topics: Advanced Computational Techniques in Science and Engineering