Underwater target detection based on improved YOLOv7
Kaiyue Liu, Qi Sun, Daming Sun, Mengduo Yang, Nizhuan Wang

TL;DR
This paper introduces YOLOv7-AC, an improved YOLOv7 network for underwater target detection with stronger feature extraction. It reaches 89.6% mAP on the URPC dataset and 97.4% mAP on the Brackish dataset, and is reported to run in real time.
Contribution
The study integrates an ACmixBlock module, a ResNet-ACmix module, and a Global Attention Mechanism (GAM) into YOLOv7, improving underwater detection accuracy and speed over existing methods, including the original YOLOv7.
Findings
Achieved 89.6% mAP on the URPC dataset
Achieved 97.4% mAP on the Brackish dataset
Outperformed the original YOLOv7 in both detection speed and accuracy
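The findings are stated in mAP (mean average precision). As a rough illustration of what that number measures, here is a minimal sketch of one common way to compute it: rank each class's detections by confidence, build the precision-recall curve, take the area under its interpolated envelope, and average over classes. The summary does not state the paper's exact evaluation protocol (IoU threshold, interpolation rule), so the all-point interpolation used here is an assumption, and the function names and toy inputs are hypothetical.

```python
def average_precision(detections, num_gt):
    """Area under the interpolated precision-recall curve for one class.

    detections: list of (confidence, is_true_positive) pairs (assumed format).
    num_gt: number of ground-truth boxes for this class.
    """
    if num_gt == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Interpolation: each precision becomes the best precision at any
    # equal-or-higher recall, so the curve never increases with recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between consecutive recall levels.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (detections, num_gt)."""
    aps = [average_precision(dets, n) for dets, n in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Toy, made-up inputs (not data from the paper).
toy = {
    "class_a": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "class_b": ([(0.9, True), (0.8, True)], 2),
}
toy_map = mean_average_precision(toy)
```

Whether a detection counts as a true positive depends on an IoU match against ground truth, which this sketch takes as given.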
Abstract
Underwater target detection is a crucial aspect of ocean exploration. However, conventional underwater target detection methods face several challenges, such as inaccurate feature extraction, slow detection speed, and a lack of robustness in complex underwater environments. To address these limitations, this study proposes an improved YOLOv7 network (YOLOv7-AC) for underwater target detection. The proposed network replaces the 3x3 convolution block in the E-ELAN structure with an ACmixBlock module, and adds skip connections and 1x1 convolutions between ACmixBlock modules to improve feature extraction and inference speed. Additionally, a ResNet-ACmix module is designed to avoid feature information loss and reduce computation, while a Global Attention Mechanism (GAM) is inserted in the backbone and head parts of the model to improve feature extraction.…
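Two of the generic building blocks named in the abstract, the 1x1 convolution and the skip connection, can be sketched in a few lines. This is not the ACmixBlock, ResNet-ACmix, or GAM design from the paper, whose details are not given in this summary. It only shows that a 1x1 convolution mixes channels independently at each pixel, and that a skip connection adds the block's input back to its output. All function names are hypothetical, and a real model would use a deep-learning framework rather than nested lists.

```python
def conv1x1(feature_map, weights, bias):
    """Per-pixel linear map across channels.

    feature_map: nested lists shaped [C_in][H][W].
    weights: nested lists shaped [C_out][C_in]; bias: list of length C_out.
    Returns nested lists shaped [C_out][H][W].
    """
    c_in = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    out = []
    for o, row in enumerate(weights):
        channel = [
            [
                bias[o] + sum(row[c] * feature_map[c][y][x] for c in range(c_in))
                for x in range(width)
            ]
            for y in range(height)
        ]
        out.append(channel)
    return out


def add_skip(x, fx):
    """Element-wise sum of two feature maps with identical shapes."""
    return [
        [[a + b for a, b in zip(row_x, row_f)] for row_x, row_f in zip(ch_x, ch_f)]
        for ch_x, ch_f in zip(x, fx)
    ]


def residual_block(x, weights, bias):
    """Skip connection around a 1x1 convolution: y = x + conv1x1(x).

    Requires C_out == C_in so the shapes match for the addition.
    """
    return add_skip(x, conv1x1(x, weights, bias))


# Toy 2-channel, 2x2 feature map (made-up values).
x = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 20.0], [30.0, 40.0]]]
w = [[1.0, 0.0], [0.5, 0.5]]
b = [0.0, 1.0]
y = residual_block(x, w, b)
```

Because the skip path carries the input through unchanged, information in `x` survives even when the convolution branch contributes little, which is the usual motivation for such connections.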
Taxonomy
Topics: Advanced Neural Network Applications · Image Enhancement Techniques · Underwater Acoustics Research
Methods: 1x1 Convolution · Convolution · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
