Mind the Hitch: Dynamic Calibration and Articulated Perception for Autonomous Trucks
Morui Zhu, Yongqi Zhu, Song Fu, Qing Yang

TL;DR
This paper introduces dCAP, a vision-based framework for continuous calibration and perception in autonomous trucks, addressing articulated geometry and sensor pose variations with a transformer-based approach.
Contribution
The work presents a novel dynamic calibration method using transformers, integrated with BEVFormer, and introduces a new benchmark dataset for autonomous trucking perception.
Findings
dCAP achieves stable, accurate perception under articulation and occlusion.
Replacing static calibration with dynamic extrinsics improves 3D detection.
The STT4AT benchmark enables evaluation of perception in realistic semi-trailer scenarios.
Abstract
Autonomous trucking poses unique challenges due to articulated tractor-trailer geometry and time-varying sensor poses caused by the fifth-wheel joint and trailer flex. Existing perception and calibration methods assume static baselines or rely on high-parallax and texture-rich scenes, limiting their reliability in real-world settings. We propose dCAP (dynamic Calibration and Articulated Perception), a vision-based framework that continuously estimates the 6-DoF (degrees of freedom) relative pose between tractor and trailer cameras. dCAP employs a transformer with cross-view and temporal attention to robustly aggregate spatial cues while maintaining temporal consistency, enabling accurate perception under rapid articulation and occlusion. Integrated with BEVFormer, dCAP improves 3D object detection by replacing static calibration with dynamically predicted extrinsics. To facilitate…
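To make the idea of swapping static calibration for dynamically predicted extrinsics concrete, here is a minimal sketch of how a 6-DoF relative pose becomes a 4x4 extrinsic matrix, and how a stale static matrix misplaces a point once the trailer articulates. It is not the dCAP implementation: the function names, the frame layout, the Z-Y-X rotation convention, and every numeric value are assumptions chosen for illustration, and the predicted pose is simply hard-coded where a network output would go.

```python
import math

def pose_to_matrix(roll, pitch, yaw, tx, ty, tz):
    """Build a 4x4 homogeneous transform from a 6-DoF pose (angles in radians).

    Assumed convention: R = Rz(yaw) * Ry(pitch) * Rx(roll), translation in metres.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, tx],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, ty],
        [-sp,     cp * sr,                cp * cr,                tz],
        [0.0,     0.0,                    0.0,                    1.0],
    ]

def transform_point(T, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    x, y, z = point
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))

# Hypothetical static calibration: trailer camera 8 m behind and 2 m above
# the tractor reference frame, with the trailer perfectly aligned (yaw = 0).
static_T = pose_to_matrix(0.0, 0.0, 0.0, -8.0, 0.0, 2.0)

# Hypothetical per-frame prediction: same mount offset, but the trailer has
# swung 15 degrees about the vertical axis during a turn.
dynamic_T = pose_to_matrix(0.0, 0.0, math.radians(15.0), -8.0, 0.0, 2.0)

# A point observed 10 m along the trailer camera's x-axis.
point_in_trailer_cam = (10.0, 0.0, 0.0)

static_xyz = transform_point(static_T, point_in_trailer_cam)
dynamic_xyz = transform_point(dynamic_T, point_in_trailer_cam)

# Distance between where the stale and the updated extrinsics place the point.
placement_error = math.dist(static_xyz, dynamic_xyz)
print(f"static:  {static_xyz}")
print(f"dynamic: {dynamic_xyz}")
print(f"placement error: {placement_error:.2f} m")
```

Under these assumed numbers, the two matrices disagree by a few metres on a point only 10 m away, which illustrates why a fixed extrinsic can degrade detection when the trailer articulates.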
