MIPD: A Multi-sensory Interactive Perception Dataset for Embodied Intelligent Driving
Zhiwei Li, Tingzhen Zhang, Meihua Zhou, Dandan Tang, Pengwei Zhang, Wenzhuo Liu, Qiaoning Yang, Tianyu Shen, Kunfeng Wang, and Huaping Liu

TL;DR
The paper introduces MIPD, a comprehensive multi-sensory dataset for autonomous driving that integrates traditional sensors with additional modalities like sound and vibration to enhance embodied intelligence.
Contribution
It presents a novel multi-sensory dataset for autonomous driving, including diverse sensor inputs beyond standard cameras and lidar, supporting research on embodied intelligent driving.
Findings
Dataset contains over 8,500 synchronized frames.
Includes challenging scenarios with varied conditions.
Validated through experimental analysis.
Abstract
When driving, humans usually rely on multiple senses to gather information and make decisions. Analogously, to achieve embodied intelligence in autonomous driving, it is essential to integrate multidimensional sensory information to facilitate interaction with the environment. However, current multi-modal fusion sensing schemes often neglect these additional sensory inputs, hindering the realization of fully autonomous driving. This paper considers multi-sensory information and proposes a multi-modal interactive perception dataset named MIPD, which extends the current autonomous driving algorithm framework and supports research on embodied intelligent driving. In addition to the conventional camera, lidar, and 4D radar data, our dataset incorporates multiple sensor inputs including sound, light intensity, vibration intensity and vehicle…
Taxonomy
Topics: Human Pose and Action Recognition
