MCRL4OR: Multimodal Contrastive Representation Learning for Off-Road Environmental Perception
Yi Yang, Zhang Zhang, Liang Wang

TL;DR
This paper introduces MCRL4OR, a multimodal contrastive learning framework that jointly learns visual, locomotion, and control features to improve off-road environmental perception for autonomous vehicles, addressing the lack of dense annotations.
Contribution
It proposes a novel contrastive learning approach that aligns locomotion states with visual and control features for off-road perception, a setting with limited annotated data.
Findings
Pre-trained multimodal representations enhance downstream perception tasks.
The approach outperforms existing methods in off-road scenarios.
The method is effective in unstructured off-road environments where annotations are limited.
Abstract
Most studies on environmental perception for autonomous vehicles (AVs) focus on urban traffic environments, where the objects/stuff to be perceived are mainly from man-made scenes and scalable datasets with dense annotations can be used to train supervised learning models. By contrast, it is hard to densely annotate a large-scale off-road driving dataset manually due to the inherently unstructured nature of off-road environments. In this paper, we propose a Multimodal Contrastive Representation Learning approach for Off-Road environmental perception, namely MCRL4OR. This approach aims to jointly learn three encoders for processing visual images, locomotion states, and control actions by aligning the locomotion states with the fused features of visual images and control actions within a contrastive learning framework. The causation behind this alignment strategy is that the inertial…
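The alignment described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: the encoders are omitted (it starts from already-computed embeddings), the fusion step is a placeholder element-wise sum, and the loss is a generic symmetric InfoNCE objective. The names `fuse`, `info_nce`, and `alignment_loss`, and the `temperature` value, are hypothetical choices made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def fuse(visual, control):
    """Placeholder fusion (assumption): element-wise sum of two embeddings."""
    return [v + c for v, c in zip(visual, control)]


def info_nce(anchors, positives, temperature=0.1):
    """Mean cross-entropy of matching anchors[i] to positives[i] in the batch."""
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of this anchor to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in p]
        # Numerically stable log-sum-exp.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_z - logits[i]
    return total / len(a)


def alignment_loss(locomotion, visual, control, temperature=0.1):
    """Symmetric contrastive loss between locomotion and fused visual+control embeddings."""
    fused = [fuse(v, c) for v, c in zip(visual, control)]
    return 0.5 * (
        info_nce(locomotion, fused, temperature)
        + info_nce(fused, locomotion, temperature)
    )


if __name__ == "__main__":
    # Toy batch of three samples with 3-dimensional embeddings.
    visual = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    control = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]
    matched = visual                   # locomotion agrees with the fused features
    shifted = visual[1:] + visual[:1]  # locomotion paired with the wrong samples
    print(alignment_loss(matched, visual, control))
    print(alignment_loss(shifted, visual, control))
```

In this toy batch the loss is close to zero when each locomotion embedding matches its own fused visual and control embedding, and large when the pairing is shifted. This is the behavior a contrastive alignment objective relies on to train the three encoders without dense annotations.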
Taxonomy
Topics: Fire Detection and Safety Systems · Video Surveillance and Tracking Methods · Air Quality Monitoring and Forecasting
Methods: ADaptive gradient method with the OPTimal convergence rate · Contrastive Learning · Focus
