Graph-Enhanced Dual-Stream Feature Fusion with Pre-Trained Model for Acoustic Traffic Monitoring
Shitong Fan, Feiyang Xiao, Wenbo Wang, Shuhan Qi, Qiaoxi Zhu, Wenwu Wang, Jian Guan

TL;DR
This paper introduces GEDF-Net, a graph-enhanced dual-stream neural network that combines pre-trained models with graph attention to improve vehicle detection and classification in acoustic traffic monitoring. It took first place in DCASE 2024 Challenge Task 10.
Contribution
The paper proposes a novel graph-enhanced dual-stream feature fusion strategy combined with pre-trained models for improved acoustic traffic monitoring performance.
Findings
Achieved 1st place in DCASE 2024 Challenge Task 10
Enhanced vehicle type and direction detection accuracy
Effectively mitigated data scarcity with pre-trained models
Abstract
Microphone array techniques are widely used in sound source localization and smart city acoustic-based traffic monitoring, but these applications face significant challenges due to the scarcity of labeled real-world traffic audio data and the complexity and diversity of application scenarios. The DCASE Challenge's Task 10 focuses on using multi-channel audio signals to count vehicles (cars or commercial vehicles) and identify their directions (left-to-right or vice versa). In this paper, we propose a graph-enhanced dual-stream feature fusion network (GEDF-Net) for acoustic traffic monitoring, which simultaneously considers vehicle type and direction to improve detection. We propose a graph-enhanced dual-stream feature fusion strategy which consists of a vehicle type feature extraction (VTFE) branch, a vehicle direction feature extraction (VDFE) branch, and a frame-level feature fusion…
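The dual-stream idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fuse_frames`, `graph_attention`), the choice of concatenation for fusion, the `window` neighbourhood, and the use of plain scaled dot-product scores in place of a learned graph attention layer are all assumptions made for illustration. Each frame is treated as a graph node that attends only to nearby frames.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_frames(type_feats, dir_feats):
    """Concatenate the two streams frame by frame (assumed fusion rule)."""
    assert len(type_feats) == len(dir_feats)
    return [t + d for t, d in zip(type_feats, dir_feats)]


def graph_attention(nodes, window=1):
    """Each frame attends to itself and to frames within `window` steps.

    Scores are scaled dot products; this stands in for a learned
    graph attention layer and is an assumption, not the paper's design.
    """
    dim = len(nodes[0])
    out = []
    for i, query in enumerate(nodes):
        nbrs = range(max(0, i - window), min(len(nodes), i + window + 1))
        scores = [
            sum(a * b for a, b in zip(query, nodes[j])) / math.sqrt(dim)
            for j in nbrs
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * nodes[j][k] for w, j in zip(weights, nbrs)) for k in range(dim)]
        )
    return out


# Hypothetical per-frame features from the two branches (3 frames, 2 dims each).
type_feats = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]  # vehicle type stream
dir_feats = [[0.2, 0.8], [0.5, 0.5], [0.8, 0.2]]   # vehicle direction stream

fused = fuse_frames(type_feats, dir_feats)
refined = graph_attention(fused, window=1)
```

Because the attention weights for each frame are non-negative and sum to one, every refined frame is a weighted average of its neighbours' fused features, so the output keeps the same shape as the fused input.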
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Traffic Prediction and Management Techniques
Methods: Softmax · Attention Is All You Need
