Cross-View Cross-Modal Unsupervised Domain Adaptation for Driver Monitoring System

Aditi Bhalla; Christian Hellert; Enkelejda Kasneci

arXiv:2511.12196 · cs.CV · November 18, 2025

TL;DR

This paper introduces a two-phase unsupervised domain adaptation framework that jointly addresses cross-view and cross-modal challenges in driver monitoring, significantly improving activity recognition accuracy across diverse vehicle setups.

Contribution

The work presents a novel joint framework combining contrastive learning and information bottleneck for cross-view and cross-modal unsupervised domain adaptation in driver monitoring.

Findings

01. Improves top-1 accuracy on RGB video data by nearly 50%.

02. Outperforms existing unsupervised domain adaptation methods by up to 5%.

03. Effectively handles real-time driver activity recognition across diverse vehicle configurations.

Abstract

Driver distraction remains a leading cause of road traffic accidents, contributing to thousands of fatalities annually across the globe. While deep learning-based driver activity recognition methods have shown promise in detecting such distractions, their effectiveness in real-world deployments is hindered by two critical challenges: variations in camera viewpoints (cross-view) and domain shifts such as changes in sensor modality or environment. Existing methods typically address either cross-view generalization or unsupervised domain adaptation in isolation, leaving a gap in the robust and scalable deployment of models across diverse vehicle configurations. In this work, we propose a novel two-phase cross-view, cross-modal unsupervised domain adaptation framework that addresses these challenges jointly on real-time driver monitoring data. In the first phase, we learn view-invariant and…

Taxonomy

Topics: Human Pose and Action Recognition · Sleep and Work-Related Fatigue · Emotion and Mood Recognition