TL;DR
This paper introduces a learning-based multimodal fusion system for referencing objects outside a moving vehicle. In simulated driving scenarios, it outperforms single-modality approaches and can be personalized to individual drivers.
Contribution
It presents a novel multimodal fusion approach for referencing objects outside the vehicle, and a transfer-of-learning personalization technique for adapting to individual drivers.
Findings
Multimodal fusion outperforms single-modality approaches in multiple aspects and conditions.
Personalization improves referencing accuracy for individual drivers.
The approach is effective on a long driving route in a simulated environment.
Abstract
Over the past decades, the addition of hundreds of sensors to modern vehicles has led to an exponential increase in their capabilities. This allows for novel approaches to interaction with the vehicle that go beyond traditional touch-based and voice command approaches, such as emotion recognition, head rotation, eye gaze, and pointing gestures. Although gaze and pointing gestures have been used before for referencing objects inside and outside vehicles, the multimodal interaction and fusion of these gestures have so far not been extensively studied. We propose a novel learning-based multimodal fusion approach for referencing outside-the-vehicle objects while maintaining a long driving route in a simulated environment. The proposed multimodal approaches outperform single-modality approaches in multiple aspects and conditions. Moreover, we also demonstrate possible ways to exploit…
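To make the idea of learned fusion concrete, here is a minimal sketch that is not the paper's model. It assumes each modality (gaze and pointing) yields a single azimuth estimate in degrees, and it learns one weight that blends the two by least squares. The function names, the toy data, and the blending-based `personalize` step are all hypothetical and serve only as illustration.

```python
def fit_fusion_weight(gaze, point, target):
    """Least-squares weight w for fused = w * gaze + (1 - w) * point.

    Assumption: each modality gives one scalar angle per referencing event.
    """
    num = sum((g - p) * (t - p) for g, p, t in zip(gaze, point, target))
    den = sum((g - p) ** 2 for g, p in zip(gaze, point))
    # Fall back to an equal blend if the modalities never disagree.
    return num / den if den else 0.5


def fuse(w, g, p):
    """Blend one gaze estimate and one pointing estimate."""
    return w * g + (1.0 - w) * p


def mean_abs_error(pred, target):
    return sum(abs(a - b) for a, b in zip(pred, target)) / len(target)


def personalize(w_population, gaze, point, target, alpha=0.5):
    """Hypothetical personalization: shift the population weight toward
    the weight fitted on a small amount of one driver's data."""
    w_driver = fit_fusion_weight(gaze, point, target)
    return (1.0 - alpha) * w_population + alpha * w_driver


# Toy population data (made-up azimuth angles in degrees).
target = [10.0, -20.0, 35.0, 5.0]
gaze = [12.0, -17.0, 38.0, 4.0]
point = [6.0, -26.0, 31.0, 7.0]

w = fit_fusion_weight(gaze, point, target)
fused = [fuse(w, g, p) for g, p in zip(gaze, point)]

# Toy data for one driver whose pointing is more accurate than their gaze.
drv_target = [0.0, 10.0]
drv_gaze = [4.0, 14.0]
drv_point = [-1.0, 9.0]
w_personal = personalize(w, drv_gaze, drv_point, drv_target, alpha=0.5)

print("population weight on gaze:", w)
print("personalized weight on gaze:", w_personal)
print("MAE gaze / point / fused:",
      mean_abs_error(gaze, target),
      mean_abs_error(point, target),
      mean_abs_error(fused, target))
```

On this toy data the fused estimate has a lower mean absolute error than either modality alone, and the personalized weight moves toward pointing for the driver whose pointing is more reliable. A real system would presumably learn from richer features than a single angle per modality.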
