Graph Embedding with Mel-spectrograms for Underwater Acoustic Target Recognition
Sheng Feng, Shuqing Ma, Xiaoqian Zhu

TL;DR
This paper introduces UATR-GTransformer, a novel non-Euclidean deep learning model combining Transformers and graph neural networks to improve underwater acoustic target recognition by capturing complex signal topology.
Contribution
It proposes a graph-based deep learning architecture that models the non-Euclidean structure of underwater acoustic signals and reports performance competitive with existing methods on benchmark datasets.
Findings
Achieves competitive performance on benchmark datasets
Effectively captures frequency-domain information
Demonstrates potential for ocean engineering applications
Abstract
Underwater acoustic target recognition (UATR) is extremely challenging due to the complexity of ship-radiated noise and the variability of ocean environments. Although deep learning (DL) approaches have achieved promising results, most existing models implicitly assume that underwater acoustic data lie in a Euclidean space. This assumption, however, is unsuitable for the inherently complex topology of underwater acoustic signals, which exhibit non-stationary, non-Gaussian, and nonlinear characteristics. To overcome this limitation, this paper proposes the UATR-GTransformer, a non-Euclidean DL model that integrates Transformer architectures with graph neural networks (GNNs). The model comprises three key components: a Mel patchify block, a GTransformer block, and a classification head. The Mel patchify block partitions the Mel-spectrogram into overlapping patches, while the GTransformer…
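The patchify step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `patchify`, the patch size of 16, the stride of 10, and the random stand-in spectrogram are all assumptions chosen for illustration. It only shows how a stride smaller than the patch size yields overlapping patches from a 2-D Mel-spectrogram.

```python
import numpy as np


def patchify(mel, patch=16, stride=10):
    """Split a (n_mels, n_frames) array into flattened, overlapping patches.

    `patch` and `stride` are hypothetical values, not taken from the paper.
    A stride smaller than the patch size makes neighbouring patches overlap.
    """
    n_mels, n_frames = mel.shape
    if n_mels < patch or n_frames < patch:
        raise ValueError("spectrogram is smaller than one patch")
    rows = range(0, n_mels - patch + 1, stride)
    cols = range(0, n_frames - patch + 1, stride)
    # Row-major scan: one flattened vector of length patch * patch per patch.
    patches = [
        mel[r:r + patch, c:c + patch].reshape(-1)
        for r in rows
        for c in cols
    ]
    return np.stack(patches)


# Random stand-in for a Mel-spectrogram with 128 Mel bins and 256 frames.
mel = np.random.default_rng(0).random((128, 256))
tokens = patchify(mel)
```

Each row of `tokens` is one flattened patch. In a graph-based model, such rows would typically serve as node features, but how the paper builds its graph from the patches is not covered by the excerpt above.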
Taxonomy
Topics: Underwater Acoustics Research · Underwater Vehicles and Communication Systems · Advanced SAR Imaging Techniques
