Counting Fish with Temporal Representations of Sonar Video
Kai Van Brunt, Justin Kay, Timm Haucke, Pietro Perona, Grant Van Horn, Sara Beery

TL;DR
This paper introduces a lightweight, domain-specific computer vision method using echograms and ResNet-18 to accurately count salmon in sonar videos, suitable for deployment in resource-limited field conditions.
Contribution
It presents a novel approach that leverages temporal echogram representations and weakly-supervised training for fish counting, reducing computational requirements compared to existing methods.
Findings
Achieved 23% count error on real-world data
Demonstrated the feasibility of a lightweight counting model in field conditions
Improved counting accuracy with domain-specific augmentations
Abstract
Accurate estimates of salmon escapement - the number of fish migrating upstream to spawn - are key data for conservation and fishery management. Existing methods for salmon counting using high-resolution imaging sonar hardware are non-invasive and compatible with computer vision processing. Prior work in this area has utilized object detection and tracking based methods for automated salmon counting. However, these techniques remain inaccessible to many sonar deployment sites due to limited compute and connectivity in the field. We propose an alternative lightweight computer vision method for fish counting based on analyzing echograms - temporal representations that compress several hundred frames of imaging sonar video into a single image. We predict upstream and downstream counts within 200-frame time windows directly from echograms using a ResNet-18 model, and propose a set of…
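The echogram idea described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the exact construction here, so the following rests on assumptions: each sonar frame is taken to be a 2-D array indexed as (range, beam), each frame is collapsed to one column by taking the maximum intensity across beams, and the columns are stacked over time. The names `make_echogram` and `split_windows` are hypothetical helpers, and the max-reduction is a stand-in, not necessarily the authors' method. Only the 200-frame window length comes from the abstract.

```python
import numpy as np


def make_echogram(frames: np.ndarray) -> np.ndarray:
    """Collapse a sonar clip of shape (T, H, W) into an (H, T) echogram.

    Assumption: H is the range axis and W is the beam axis. Each frame
    becomes one column holding the max intensity across beams per range bin.
    """
    if frames.ndim != 3:
        raise ValueError("expected frames with shape (T, H, W)")
    # Max over the beam axis gives (T, H); transpose so time runs left to right.
    return frames.max(axis=2).T


def split_windows(echogram: np.ndarray, window: int = 200) -> list:
    """Cut an (H, T) echogram into non-overlapping windows of `window` frames.

    Trailing frames that do not fill a complete window are dropped.
    """
    n_time = echogram.shape[1]
    return [
        echogram[:, start:start + window]
        for start in range(0, n_time - window + 1, window)
    ]


# Synthetic stand-in for a sonar clip: 450 frames, 64 range bins, 32 beams.
rng = np.random.default_rng(0)
clip = rng.random((450, 64, 32)).astype(np.float32)

echogram = make_echogram(clip)
windows = split_windows(echogram)
print(echogram.shape, len(windows), windows[0].shape)
# prints: (64, 450) 2 (64, 200)
```

Under these assumptions, each 200-frame window is a single image that could be passed to an image model such as ResNet-18 with two regression outputs (upstream and downstream counts), which is far cheaper than running a detector and tracker on every frame.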
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques · Music and Audio Processing
Methods: Sparse Evolutionary Training
