CM2-Net: Continual Cross-Modal Mapping Network for Driver Action Recognition

Ruoyu Wang, Chen Cai, Wenqian Wang, Jianjun Gao, Dan Lin, Wenyang Liu, and Kim-Hui Yap

arXiv:2406.11340·cs.CV·August 6, 2024

Open Access

TL;DR

This paper introduces CM2-Net, a continual learning framework that effectively integrates new non-RGB modalities for driver action recognition by using instructive prompts from previously learned modalities, improving recognition accuracy.

Contribution

The paper proposes a novel Continual Cross-Modal Mapping Network with Accumulative Cross-modal Mapping Prompting to enhance multi-modal driver action recognition in continual learning settings.

Findings

01

CM2-Net outperforms existing methods on the Drive&Act dataset.

02

The approach effectively leverages prompts to incorporate new modalities.

03

Results show improved recognition accuracy for both uni- and multi-modal data.

Abstract

Driver action recognition has significantly advanced in enhancing driver-vehicle interactions and ensuring driving safety by integrating multiple modalities, such as infrared and depth. Nevertheless, compared to RGB modality only, it is always laborious and costly to collect extensive data for all types of non-RGB modalities in car cabin environments. Therefore, previous works have suggested independently learning each non-RGB modality by fine-tuning a model pre-trained on RGB videos, but these methods are less effective in extracting informative features when faced with newly-incoming modalities due to large domain gaps. In contrast, we propose a Continual Cross-Modal Mapping Network (CM2-Net) to continually learn each newly-incoming modality with instructive prompts from the previously-learned modalities. Specifically, we have developed Accumulative Cross-modal Mapping Prompting…

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Autonomous Vehicle Technology and Safety