Integrating Vehicle Acoustic Data for Enhanced Urban Traffic Management: A Study on Speed Classification in Suzhou
Pengfei Fan, Yuli Zhang, Xinheng Wang, Ruiyuan Jiang, Hankang Gu, Dongyao Jia, Shangbo Wang

TL;DR
This paper introduces a new acoustic dataset and a deep learning model for classifying vehicle speeds in urban traffic. The model reaches 87.56% accuracy on the new dataset and 96.28% on IDMT-Traffic, suggesting potential for real-time traffic management and noise pollution reduction.
Contribution
The study publicly releases the SZUR-Acoustic Dataset and proposes a bimodal-feature-fusion deep CNN (BMCNN) with a cross-modal attention mechanism for vehicle speed classification, improving accuracy and robustness.
Findings
Achieved 87.56% accuracy on the SZUR-Acoustic Dataset
Achieved 96.28% accuracy on the IDMT-Traffic dataset
Validated the effectiveness of the bimodal fusion and attention modules
Abstract
This study presents and publicly releases the Suzhou Urban Road Acoustic Dataset (SZUR-Acoustic Dataset), which is accompanied by comprehensive data-acquisition protocols and annotation guidelines to ensure transparency and reproducibility of the experimental workflow. To model the coupling between vehicular noise and driving speed, we propose a bimodal-feature-fusion deep convolutional neural network (BMCNN). During preprocessing, an adaptive denoising and normalization strategy is applied to suppress environmental background interference; in the network architecture, parallel branches extract Mel-frequency cepstral coefficients (MFCCs) and wavelet-packet energy features, which are subsequently fused via a cross-modal attention mechanism in the intermediate feature space to fully exploit time-frequency information. Experimental results demonstrate that BMCNN achieves a classification…
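The abstract names two feature branches (MFCCs and wavelet-packet energies) fused by cross-modal attention but gives no implementation detail. The following is a minimal NumPy sketch of two of those ideas, not the authors' BMCNN: it assumes a Haar wavelet, three decomposition levels, and plain scaled dot-product attention with no learned projections. The function names `haar_wavelet_packet_energies` and `cross_attention`, and the random arrays standing in for branch features, are hypothetical and chosen only for illustration.

```python
import numpy as np

def haar_wavelet_packet_energies(signal, levels=3):
    """Energy of each terminal node of a Haar wavelet-packet tree.

    Assumption: Haar filters and `levels` are illustrative choices;
    the paper's wavelet and depth are not stated in the abstract.
    Nodes are returned in natural (tree) order, not frequency order.
    """
    x = np.asarray(signal, dtype=float)
    # Trim so the length divides evenly at every level.
    n = (len(x) // 2**levels) * 2**levels
    nodes = [x[:n]]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            next_nodes.append((even + odd) / np.sqrt(2))  # low-pass half
            next_nodes.append((even - odd) / np.sqrt(2))  # high-pass half
        nodes = next_nodes
    return np.array([np.sum(node**2) for node in nodes])

def softmax(z, axis=-1):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats):
    """Scaled dot-product attention from one branch onto the other.

    Assumption: both branches already share the same feature width;
    a trained model would add learned query/key/value projections.
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ context_feats, weights

rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)            # stand-in for an audio frame
energies = haar_wavelet_packet_energies(signal, levels=3)

mfcc_like = rng.standard_normal((10, 16))     # placeholder MFCC-branch features
wp_like = rng.standard_normal((6, 16))        # placeholder wavelet-branch features
fused, weights = cross_attention(mfcc_like, wp_like)
```

Because the Haar transform is orthonormal, the eight node energies sum to the energy of the trimmed signal, which is a convenient sanity check. In practice one would likely compute MFCCs and wavelet packets with an audio or wavelet library rather than by hand.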
Taxonomy
Topics: Traffic Prediction and Management Techniques · Music and Audio Processing · Vehicle Noise and Vibration Control
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
