CHARM3R: Towards Unseen Camera Height Robust Monocular 3D Detector
Abhinav Kumar, Yuliang Guo, Zhihao Zhang, Xinyu Huang, Liu Ren, Xiaoming Liu

TL;DR
This paper investigates how camera height variations affect monocular 3D object detection and introduces CHARM3R, a method that is more robust to unseen camera heights and achieves state-of-the-art results on the CARLA dataset.
Contribution
The paper systematically analyzes how camera height affects monocular 3D detection and proposes CHARM3R, which averages regressed and ground-based depth estimates to improve robustness to unseen camera heights.
Findings
Depth estimation is a primary factor in performance under camera height variations.
CHARM3R improves generalization to unseen camera heights by more than 45%.
CHARM3R achieves state-of-the-art results on the CARLA dataset.
Abstract
Monocular 3D object detectors, while effective on data from one ego camera height, struggle with unseen or out-of-distribution camera heights. Existing methods often rely on Plücker embeddings, image transformations or data augmentation. This paper takes a step towards this understudied problem by first investigating the impact of camera height variations on state-of-the-art (SoTA) Mono3D models. With a systematic analysis on the extended CARLA dataset with multiple camera heights, we observe that depth estimation is a primary factor influencing performance under height variations. We mathematically prove and also empirically observe consistent negative and positive trends in mean depth error of regressed and ground-based depth models, respectively, under camera height changes. To mitigate this, we propose Camera Height Robust Monocular 3D Detector (CHARM3R), which averages both depth…
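To make the averaging idea in the abstract concrete, here is a toy Python sketch, not the authors' implementation. It assumes a level pinhole camera over a flat ground plane, for which a ground pixel at image row v has depth z = f * H / (v - v0). The function names and all numeric values, including the two depth errors, are hypothetical and chosen only to show how opposite-signed errors can partly cancel when averaged.

```python
def ground_depth(v, f, v0, cam_height):
    """Depth (m) of a flat-ground pixel at image row v.

    Assumes a level pinhole camera: f is the focal length in pixels,
    v0 the principal-point row, cam_height the camera height in meters.
    """
    if v <= v0:
        raise ValueError("pixel must lie below the horizon row v0")
    return f * cam_height / (v - v0)


def averaged_depth(z_regressed, z_ground):
    """Simple mean of a regressed and a ground-based depth estimate."""
    return 0.5 * (z_regressed + z_ground)


if __name__ == "__main__":
    # Hypothetical numbers, not results from the paper.
    z_true = 20.0        # true object depth in meters
    z_regressed = 18.0   # regressed depth with a negative error
    z_ground = 22.5      # ground-based depth with a positive error

    z_avg = averaged_depth(z_regressed, z_ground)
    print("regressed error:", z_regressed - z_true)
    print("ground error:   ", z_ground - z_true)
    print("averaged error: ", z_avg - z_true)
```

In this toy setup the averaged error is smaller in magnitude than either individual error because the two errors have opposite signs. How well that holds in practice depends on how closely the two trends mirror each other.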
