# Physics-Informed Side-Scan Sonar Perception: Tackling Weak Targets and Sparse Debris via Geometric and Frequency Decoupling

**Authors:** Bojian Yu, Rongsheng Lin, Hanxiang Zhou, Jianxiong Zhang, Xinwei Zhang

PMC · DOI: 10.3390/s26061938 · Sensors (Basel, Switzerland) · 2026-03-19

## TL;DR

This paper introduces WPG-DetNet, a physics-informed sonar perception system that improves detection of weak targets and sparse debris in underwater search and rescue.

## Contribution

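The framework combines three components: wavelet-based frequency decoupling in the Wavelet-Embedded Residual Backbone (WERB), debris graph reasoning in the Debris Graph Reasoning Module (D-GRM), and a physics-informed geometric loss in the Shadow-Aided Decoupling Head (SADH).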
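To make the frequency-decoupling idea concrete, the sketch below shows how a single-level 2D Haar wavelet transform can stand in for pooling: it halves the resolution like a stride-2 pool, but keeps the detail that pooling would discard in three extra sub-bands. This is only an illustration under assumptions. The summary does not say which wavelet WPG-DetNet uses or how the sub-bands are fused, so the Haar basis and the function names `haar_dwt2` and `haar_idwt2` are hypothetical choices, not the authors' implementation.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar DWT of an (H, W) array (H, W even).

    Returns four half-resolution sub-bands:
    approx (local average) plus three detail bands (local differences).
    """
    x = np.asarray(x, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0     # smooth structure
    col_diff = (a - b + c - d) / 2.0   # change between adjacent columns
    row_diff = (a + b - c - d) / 2.0   # change between adjacent rows
    diag_diff = (a - b - c + d) / 2.0  # diagonal change
    return approx, col_diff, row_diff, diag_diff


def haar_idwt2(approx, col_diff, row_diff, diag_diff):
    """Inverse of haar_dwt2: rebuilds the full-resolution array exactly."""
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (approx + col_diff + row_diff + diag_diff) / 2.0
    x[0::2, 1::2] = (approx - col_diff + row_diff - diag_diff) / 2.0
    x[1::2, 0::2] = (approx + col_diff - row_diff - diag_diff) / 2.0
    x[1::2, 1::2] = (approx - col_diff - row_diff + diag_diff) / 2.0
    return x


# Toy stand-in for a sonar patch (random values, not real data).
rng = np.random.default_rng(0)
patch = rng.normal(size=(8, 8))
bands = haar_dwt2(patch)
restored = haar_idwt2(*bands)
```

Unlike max or average pooling, the transform is invertible, so no information is lost at the downsampling step. A network built on this idea could pass the approximation band forward and learn how much of each detail band to keep, which is one plausible way to suppress speckle without blurring object edges.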

## Key findings

- WPG-DetNet achieves 97.5% mean Average Precision (mAP50) and 96.9% Recall on the SCTD dataset.
- The model outperforms Faster R-CNN by 12.8% in mAP50 and RT-DETR-R18 by 5.6% in mAP50:95, the stricter high-precision localization metric.
- The system maintains 62.5 FPS inference speed with 16.8 M parameters, balancing real-time performance and accuracy.
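As a rough sanity check on the efficiency figures, the reported throughput and parameter count can be converted into a per-frame latency and a weights-only memory footprint. The numeric precision is an assumption (the summary does not state it), and the estimate ignores activations and runtime overhead.

```python
# Reported figures from the key findings.
fps = 62.5        # frames per second
params = 16.8e6   # number of parameters

# Time available per frame at the reported throughput.
latency_ms = 1000.0 / fps

# Weights-only storage, assuming 4 bytes per parameter (FP32)
# or 2 bytes per parameter (FP16); decimal megabytes.
fp32_mb = params * 4 / 1e6
fp16_mb = params * 2 / 1e6

print(f"latency per frame: {latency_ms:.1f} ms")
print(f"weights at FP32:   {fp32_mb:.1f} MB")
print(f"weights at FP16:   {fp16_mb:.1f} MB")
```

This works out to 16 ms per frame and about 67 MB of FP32 weights, or about 34 MB at FP16.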

## Abstract

Side-scan sonar (SSS) serves as the primary perceptual instrument for Autonomous Underwater Vehicles (AUVs) in large-scale marine search and rescue (SAR) operations. However, the detection of critical targets is frequently hindered by severe hydro-acoustic noise, the spatial discontinuity of wreckage, and the weak visual signatures of small targets. To surmount these challenges, this paper presents WPG-DetNet. First, we introduce a Wavelet-Embedded Residual Backbone (WERB) to reconstruct the conventional downsampling paradigm. By substituting standard pooling with the Discrete Wavelet Transform (DWT), this architecture explicitly disentangles high-frequency noise from structural information in the frequency domain, thereby achieving the adaptive preservation of edge fidelity for large human-made targets while filtering out speckle interference. Then, addressing the distinct challenge of discontinuous aircraft wreckage, the framework further incorporates a Debris Graph Reasoning Module (D-GRM). This module models scattered fragments as nodes in a topological graph to capture long-range semantic dependencies, transforming isolated instance recognition into context-aware scene understanding. Finally, to bridge the gap between AI and underwater physics, we design a Shadow-Aided Decoupling Head (SADH) equipped with a physics-informed geometric loss. By enforcing mathematical consistency between target height and acoustic shadow length, this mechanism establishes a rigorous discriminative criterion capable of distinguishing weak-echo human bodies from seabed rocks based on shadow geometry. Experiments on the SCTD dataset demonstrate that WPG-DetNet achieves a mean Average Precision (mAP50) of 97.5% and a Recall of 96.9%. Quantitative analysis reveals that our framework outperforms the classic Faster R-CNN by a margin of 12.8% in mAP50 and surpasses the Transformer-based RT-DETR-R18 by 5.6% in high-precision localization metrics (mAP50:95). 
Simultaneously, WPG-DetNet maintains superior efficiency with an inference speed of 62.5 FPS and a lightweight parameter count of 16.8 M, striking an optimal balance between robust perception and the real-time constraints of AUV operations.
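The height-to-shadow consistency mentioned in the abstract can be illustrated with the standard flat-seabed side-scan geometry. With the sonar at altitude `H` above the seabed, a shadow that starts at slant range `R` and extends for slant length `L` implies, by similar triangles, a target height `h = H * L / (R + L)`. The sketch below implements that relation and a toy squared-error penalty between a predicted height and the shadow-implied height. The flat seabed, straight acoustic rays, and the function names are assumptions made for illustration; the actual loss used in SADH is not given in this summary and may differ.

```python
import numpy as np


def height_from_shadow(altitude, slant_range, shadow_length):
    """Target height implied by its acoustic shadow (flat seabed, straight rays).

    altitude:      sonar height above the seabed
    slant_range:   slant range at which the shadow begins
    shadow_length: shadow extent along the slant-range axis
    All lengths share one unit (e.g. metres).
    """
    altitude = np.asarray(altitude, dtype=np.float64)
    slant_range = np.asarray(slant_range, dtype=np.float64)
    shadow_length = np.asarray(shadow_length, dtype=np.float64)
    if np.any(altitude <= 0) or np.any(slant_range <= 0) or np.any(shadow_length < 0):
        raise ValueError("altitude and slant_range must be > 0, shadow_length >= 0")
    return altitude * shadow_length / (slant_range + shadow_length)


def shadow_from_height(altitude, slant_range, height):
    """Inverse relation: shadow length cast by a target of the given height."""
    altitude = np.asarray(altitude, dtype=np.float64)
    slant_range = np.asarray(slant_range, dtype=np.float64)
    height = np.asarray(height, dtype=np.float64)
    if np.any(height < 0) or np.any(height >= altitude):
        raise ValueError("height must satisfy 0 <= height < altitude")
    return height * slant_range / (altitude - height)


def shadow_consistency_penalty(pred_height, altitude, slant_range, shadow_length):
    """Hypothetical penalty: mean squared gap between predicted and implied height."""
    implied = height_from_shadow(altitude, slant_range, shadow_length)
    gap = np.asarray(pred_height, dtype=np.float64) - implied
    return float(np.mean(gap ** 2))


# Example: 10 m altitude, shadow starts at 30 m and is 10 m long,
# so the implied height is 10 * 10 / (30 + 10) = 2.5 m.
implied_height = height_from_shadow(10.0, 30.0, 10.0)
penalty = shadow_consistency_penalty(2.5, 10.0, 30.0, 10.0)
```

A penalty of this kind is zero when the predicted height agrees with the shadow and grows as the two diverge, which is the sense in which shadow geometry can help separate a low-echo object that stands proud of the seabed from flat seabed texture.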

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13029880/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13029880/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC13029880/full.md

---
Source: https://tomesphere.com/paper/PMC13029880