Efficient Vision-based Vehicle Speed Estimation
Andrej Macko, Lukáš Gajdošech, Viktor Kocur

TL;DR
This paper introduces a computationally efficient method for estimating vehicle speed from traffic camera footage. By optimizing models and leveraging vanishing point geometry, it achieves real-time performance and improved accuracy, especially on edge devices.
Contribution
The paper presents novel improvements for real-time vehicle speed estimation from traffic videos, achieving higher speed-estimation and detection accuracy at a significantly lower computational cost.
Findings
Achieves median speed estimation error of 0.58 km/h, outperforming previous methods.
Attains detection precision of 91.02% and recall of 91.14%, surpassing prior benchmarks.
Runs 5.5 times faster than previous state-of-the-art methods.
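The metrics named in the findings above (median speed error, detection precision and recall) can be illustrated with a minimal sketch. The function names and the toy numbers below are hypothetical and are not taken from the paper; the exact evaluation protocol used on BrnoCompSpeed may differ, for example in how detections are matched to ground-truth vehicles.

```python
from statistics import median


def median_abs_speed_error(estimated_kmh, ground_truth_kmh):
    """Median of per-vehicle absolute speed errors, in km/h (hypothetical helper)."""
    if len(estimated_kmh) != len(ground_truth_kmh):
        raise ValueError("inputs must have the same length")
    return median(abs(e - g) for e, g in zip(estimated_kmh, ground_truth_kmh))


def precision_recall(true_pos, false_pos, false_neg):
    """Standard detection precision and recall from match counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Toy values for illustration only; these are NOT results from the paper.
estimated = [80.0, 95.0, 110.0, 63.0]
ground_truth = [79.0, 96.0, 112.0, 60.0]
error_kmh = median_abs_speed_error(estimated, ground_truth)

precision, recall = precision_recall(true_pos=90, false_pos=10, false_neg=30)
print(f"median error: {error_kmh} km/h, precision: {precision}, recall: {recall}")
```

A median is less sensitive than a mean to a few badly tracked vehicles, which is one plausible reason to report it for speed error.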
Abstract
This paper presents a computationally efficient method for vehicle speed estimation from traffic camera footage. Building upon previous work that utilizes 3D bounding boxes derived from 2D detections and vanishing point geometry, we introduce several improvements to enhance real-time performance. We evaluate our method in several variants on the BrnoCompSpeed dataset in terms of vehicle detection and speed estimation accuracy. Our extensive evaluation across various hardware platforms, including edge devices, demonstrates significant gains in frames per second (FPS) compared to the prior state-of-the-art, while maintaining comparable or improved speed estimation accuracy. We analyze the trade-off between accuracy and computational cost, showing that smaller models utilizing post-training quantization offer the best balance for real-world deployment. Our best performing model beats…
