A Framework for Multisensory Foresight for Embodied Agents
Xiaohui Chen, Ramtin Hosseini, Karen Panetta, Jivko Sinapov

TL;DR
This paper introduces an unsupervised, multi-modal predictive neural network framework for embodied agents that integrates visual, haptic, audio, and tactile data to improve future sensory state prediction.
Contribution
The paper presents a novel multi-modal, unsupervised neural network architecture that predicts future sensory states across multiple modalities, enhancing environmental understanding for embodied agents.
Findings
Multi-modal integration improves prediction accuracy.
The framework effectively predicts future sensory signals in real-world tasks.
Using non-visual modalities enhances visual prediction performance.
Abstract
Predicting future sensory states is crucial for learning agents such as robots, drones, and autonomous vehicles. In this paper, we couple multiple sensory modalities with exploratory actions and propose a predictive neural network architecture to address this problem. Most existing approaches rely on large, manually annotated datasets, or only use visual data as a single modality. In contrast, the unsupervised method presented here uses multi-modal perceptions for predicting future visual frames. As a result, the proposed model is more comprehensive and can better capture the spatio-temporal dynamics of the environment, leading to more accurate visual frame prediction. The other novelty of our framework is the use of sub-networks dedicated to anticipating future haptic, audio, and tactile signals. The framework was tested and validated with a dataset containing 4 sensory modalities…
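As a rough illustration of the idea in the abstract (current multi-modal observations plus an exploratory action are fused, and separate sub-networks predict the next signal for each modality), here is a minimal sketch. It is not the authors' architecture: the feature sizes, the single-layer encoders, the fusion step, and every name (`MODALITY_DIMS`, `predict_next`, `prediction_loss`) are assumptions made for illustration. The inputs are random numbers, not data from the paper.

```python
import math
import random

# Hypothetical feature sizes per modality; the paper's real dimensions are not given here.
MODALITY_DIMS = {"vision": 64, "haptic": 8, "audio": 16, "tactile": 4}
ACTION_DIM = 5    # assumed size of a one-hot exploratory-action vector
LATENT_DIM = 32   # assumed size of each encoded and fused representation


def random_matrix(rng, rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(vec, weights):
    """Multiply a length-n vector by a weight matrix stored as n rows of m columns."""
    return [sum(v * row[j] for v, row in zip(vec, weights))
            for j in range(len(weights[0]))]


def init_params(rng):
    """Create one encoder and one prediction head per modality, plus a shared fusion layer."""
    params = {"enc": {}, "head": {}}
    for name, dim in MODALITY_DIMS.items():
        params["enc"][name] = random_matrix(rng, dim, LATENT_DIM)
        params["head"][name] = random_matrix(rng, LATENT_DIM, dim)
    fused_in = LATENT_DIM * len(MODALITY_DIMS) + ACTION_DIM
    params["fuse"] = random_matrix(rng, fused_in, LATENT_DIM)
    return params


def predict_next(params, observations, action):
    """Map current observations and an action to a predicted next observation per modality."""
    encoded = []
    for name in MODALITY_DIMS:
        encoded.extend(math.tanh(x) for x in linear(observations[name], params["enc"][name]))
    # Fuse all encoded modalities together with the action vector.
    fused = [math.tanh(x) for x in linear(encoded + list(action), params["fuse"])]
    # Each modality has its own head (sub-network) reading the shared fused state.
    return {name: linear(fused, params["head"][name]) for name in MODALITY_DIMS}


def prediction_loss(predicted, target):
    """Sum of per-modality mean squared errors against the observed next state."""
    total = 0.0
    for name in MODALITY_DIMS:
        diffs = [(p - t) ** 2 for p, t in zip(predicted[name], target[name])]
        total += sum(diffs) / len(diffs)
    return total


rng = random.Random(0)
params = init_params(rng)
obs_t = {n: [rng.gauss(0.0, 1.0) for _ in range(d)] for n, d in MODALITY_DIMS.items()}
obs_next = {n: [rng.gauss(0.0, 1.0) for _ in range(d)] for n, d in MODALITY_DIMS.items()}
action = [1.0 if i == 2 else 0.0 for i in range(ACTION_DIM)]

pred = predict_next(params, obs_t, action)
loss = prediction_loss(pred, obs_next)
```

The training target in this sketch is simply the next observation, which is why no manual annotation is needed. Training the weights (for example by gradient descent on `loss`) is left out.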
Taxonomy
Topics: Multisensory perception and integration · Olfactory and Sensory Function Studies · Tactile and Sensory Interactions
