Mel-spectrogram features for acoustic vehicle detection and speed estimation
Nikola Bulatovic, Slobodan Djukanovic

TL;DR
This paper presents a supervised learning approach that uses mel-spectrogram features for acoustic vehicle detection and speed estimation from single-microphone recordings, reporting an average speed estimation error of 7.87 km/h on a dataset recorded in urban environments.
Contribution
It uses mel-spectrogram-based features directly for both vehicle detection and speed estimation, without introducing any intermediate features, and evaluates the approach on on-field urban data.
Findings
Average speed estimation error of 7.87 km/h
48.7% average accuracy for correct class prediction when speed is discretized into 10 km/h intervals
91.0% average accuracy when an offset of one class is allowed
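The two classification figures above differ only in how strictly a prediction is scored. A minimal sketch of that scoring follows, assuming speeds are binned by floor division into 10 km/h classes. The function names and the sample speeds are hypothetical and are not taken from the paper; the paper's exact binning convention is not stated in this summary.

```python
def speed_class(speed_kmh, interval_kmh=10.0):
    """Map a speed to a class index (assumed binning: floor division)."""
    return int(speed_kmh // interval_kmh)


def class_accuracy(true_speeds, predicted_speeds, interval_kmh=10.0, max_offset=0):
    """Fraction of predictions whose class is within max_offset classes of the true class."""
    if not true_speeds or len(true_speeds) != len(predicted_speeds):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        abs(speed_class(t, interval_kmh) - speed_class(p, interval_kmh)) <= max_offset
        for t, p in zip(true_speeds, predicted_speeds)
    )
    return hits / len(true_speeds)


# Hypothetical speeds in km/h, for illustration only.
true_speeds = [42.0, 55.0, 68.0, 71.0]
predicted_speeds = [47.0, 61.0, 52.0, 74.0]

exact = class_accuracy(true_speeds, predicted_speeds)                   # same class required
tolerant = class_accuracy(true_speeds, predicted_speeds, max_offset=1)  # neighbouring class accepted
print(exact, tolerant)
```

With `max_offset=1`, a prediction that lands in a neighbouring 10 km/h class still counts as correct, which is why the tolerant score can never be lower than the exact one.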
Abstract
The paper addresses acoustic vehicle detection and speed estimation from single sensor measurements. We predict the vehicle's pass-by instant by minimizing clipped vehicle-to-microphone distance, which is predicted from the mel-spectrogram of input audio, in a supervised learning approach. In addition, mel-spectrogram-based features are used directly for vehicle speed estimation, without introducing any intermediate features. The results show that the proposed features can be used for accurate vehicle detection and speed estimation, with an average error of 7.87 km/h. If we formulate speed estimation as a classification problem, with a 10 km/h discretization interval, the proposed method attains the average accuracy of 48.7% for correct class prediction and 91.0% when an offset of one class is allowed. The proposed method is evaluated on a dataset of 304 urban-environment on-field…
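The detection step described in the abstract can be illustrated with a small sketch. It assumes the clipped distance is the absolute time offset from the pass-by instant, capped at a threshold. The abstract does not define the distance unit or the clipping threshold, so `clip_at`, the frame spacing, and both function names are hypothetical. In the actual method the distance curve is predicted from the mel-spectrogram by a trained model; here the ideal target curve stands in for that prediction.

```python
def clipped_distance(frame_times, pass_by_time, clip_at):
    """Assumed target curve: |t - pass_by_time| capped at clip_at (seconds)."""
    return [min(abs(t - pass_by_time), clip_at) for t in frame_times]


def estimate_pass_by(frame_times, predicted_distance):
    """Return the frame time at which the (predicted) clipped distance is smallest."""
    if not frame_times or len(frame_times) != len(predicted_distance):
        raise ValueError("inputs must be non-empty and of equal length")
    best = min(range(len(frame_times)), key=lambda i: predicted_distance[i])
    return frame_times[best]


# Hypothetical setup: one frame every 0.5 s, vehicle passes at t = 3.0 s.
frame_times = [i * 0.5 for i in range(11)]
target = clipped_distance(frame_times, pass_by_time=3.0, clip_at=1.0)

# A regressor trained on mel-spectrogram frames would supply this curve;
# the ideal target is used here as a stand-in for its output.
print(estimate_pass_by(frame_times, target))
```

Clipping keeps the regression target flat far from the vehicle, so only frames near the pass-by carry a distinctive value and the minimum of the curve marks the detection.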
