MOSE: Boosting Vision-based Roadside 3D Object Detection with Scene Cues

Xiahan Chen; Mingjian Chen; Sanli Tang; Yi Niu; Jiang Zhu

arXiv:2404.05280 · cs.CV · April 9, 2024 · 1 citation

TL;DR

This paper introduces MOSE, a monocular 3D object detection framework for roadside cameras that uses frame-invariant scene cues and a transformer-based decoder, and reports state-of-the-art results.

Contribution

The paper proposes a scene cue bank and a transformer-based decoder to enhance 3D object detection from roadside cameras, addressing inter-frame consistency and scene invariance.

Findings

01 Surpasses existing methods on public benchmarks

02 Achieves significant performance improvements

03 Demonstrates robustness across diverse scenes

Abstract

3D object detection from roadside cameras gives autonomous driving an additional way to alleviate the occlusion and short perception range of vehicle-mounted cameras. Previous methods for roadside 3D object detection mainly focus on modeling the depth or height of objects, neglecting the fact that the cameras are stationary and that frames from the same camera are consistent with one another. In this work, we propose a novel framework, namely MOSE, for MOnocular 3D object detection with Scene cuEs. Scene cues are frame-invariant, scene-specific features that are crucial for object localization and can be intuitively regarded as the height between the surface of the real road and the virtual ground plane. In the proposed framework, a scene cue bank aggregates scene cues from multiple frames of the same scene with a carefully designed extrinsic augmentation strategy. Then, a…
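
To make the idea of a scene cue bank more concrete, here is a minimal sketch, not the authors' implementation. It assumes each frame yields a fixed-length feature vector and that the bank simply keeps a running mean per scene. The abstract as shown here does not specify the aggregation rule, and the names `SceneCueBank`, `update`, and `get` are hypothetical.

```python
from collections import defaultdict


class SceneCueBank:
    """Hypothetical per-scene store: running mean of per-frame feature vectors.

    Assumption: a running mean stands in for whatever aggregation MOSE uses.
    """

    def __init__(self, dim):
        self.dim = dim
        # Per-scene element-wise sums and frame counts.
        self.sums = defaultdict(lambda: [0.0] * dim)
        self.counts = defaultdict(int)

    def update(self, scene_id, frame_feature):
        """Fold one frame's feature vector into the cue for its scene."""
        if len(frame_feature) != self.dim:
            raise ValueError("feature length does not match bank dimension")
        acc = self.sums[scene_id]
        for i, value in enumerate(frame_feature):
            acc[i] += value
        self.counts[scene_id] += 1

    def get(self, scene_id):
        """Return the aggregated cue for a scene, or None if it is unseen."""
        n = self.counts.get(scene_id, 0)
        if n == 0:
            return None
        return [s / n for s in self.sums[scene_id]]
```

Because a roadside camera is stationary, frames from the same camera can share one `scene_id`, so the stored cue stabilizes as more frames arrive. The extrinsic augmentation strategy mentioned in the abstract is not modeled in this sketch.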

Taxonomy

Topics: Advanced Neural Network Applications · Autonomous Vehicle Technology and Safety · Video Surveillance and Tracking Methods
