TL;DR
CAMOT introduces a camera angle estimator that improves multi-object tracking by addressing occlusion and inaccurate depth-direction distance estimation, achieving state-of-the-art results on MOT17 at a lower computational cost than deep-learning-based depth estimators.
Contribution
The paper presents a simple camera angle estimator that improves multi-object tracking accuracy by supplying pseudo-3D information (a depth for each object) without heavy computation.
Findings
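Achieves state-of-the-art results on MOT17 when applied to ByteTrack: 63.8% HOTA, 80.6% MOTA, and 78.5% IDF1.
Mitigates the effects of occlusion and of inaccurate distance estimation in the depth direction.
Has a significantly lower computational cost than existing deep-learning-based depth estimators for tracking.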
Abstract
This paper proposes CAMOT, a simple camera angle estimator for multi-object tracking to tackle two problems: 1) occlusion and 2) inaccurate distance estimation in the depth direction. Under the assumption that multiple objects are located on a flat plane in each video frame, CAMOT estimates the camera angle using object detection. In addition, it gives the depth of each object, enabling pseudo-3D MOT. We evaluated its performance by adding it to various 2D MOT methods on the MOT17 and MOT20 datasets and confirmed its effectiveness. Applying CAMOT to ByteTrack, we obtained 63.8% HOTA, 80.6% MOTA, and 78.5% IDF1 in MOT17, which are state-of-the-art results. Its computational cost is significantly lower than the existing deep-learning-based depth estimators for tracking.
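To make the flat-plane assumption concrete, the sketch below shows how a known camera angle can turn a 2D detection into a depth value. It is an illustration of standard pinhole-camera geometry, not the estimator described in the paper: the function name `ground_depth`, the parameters (`focal_px`, `cam_height`, `tilt_rad`), and the use of the bounding box's bottom edge as the ground contact point are all assumptions made for this example.

```python
import math


def ground_depth(v_bottom, cy, focal_px, cam_height, tilt_rad):
    """Horizontal distance from the camera to a ground point seen at image row v_bottom.

    Hypothetical helper: assumes a pinhole camera at height cam_height above a
    flat ground plane, pitched downward by tilt_rad, with principal-point row cy
    and focal length focal_px (in pixels). Image rows grow downward.
    """
    # Angle of the viewing ray below the optical axis.
    alpha = math.atan2(v_bottom - cy, focal_px)
    # Angle of the viewing ray below the horizontal.
    beta = tilt_rad + alpha
    if not 0.0 < beta < math.pi / 2:
        raise ValueError("ray does not meet the ground plane in front of the camera")
    # Right triangle formed by the camera height and the ray to the ground point.
    return cam_height / math.tan(beta)


if __name__ == "__main__":
    # Bottom edges of two hypothetical detections in a 1080-row image.
    for v in (600, 800):
        d = ground_depth(v, cy=540, focal_px=1000, cam_height=2.0, tilt_rad=math.radians(20))
        print(f"row {v}: depth {d:.2f} (same units as cam_height)")
```

Under these assumptions, a box whose bottom edge sits lower in the image maps to a smaller depth, which is the kind of ordering cue a tracker can use when two boxes overlap in 2D.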
Videos
CAMOT: Camera Angle-Aware Multi-Object Tracking (YouTube)
