WebUOT-1M: Advancing Deep Underwater Object Tracking with A Million-Scale Benchmark

Chunhui Zhang; Li Liu; Guanjie Huang; Hao Wen; Xi Zhou; Yanfeng Wang

arXiv:2405.19818 · cs.CV · May 31, 2024 · 2 cites

TL;DR

This paper introduces WebUOT-1M, the largest public underwater object tracking benchmark to date, with 1.1 million frames across 1,500 video clips, and proposes a knowledge distillation framework to improve tracking performance in underwater environments.

Contribution

The paper presents WebUOT-1M, a large-scale underwater tracking dataset, and a new knowledge distillation method to transfer open-air tracking knowledge to underwater models.
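
To make the idea of transferring knowledge from one model to another concrete, the following is a minimal sketch of generic soft-label knowledge distillation (a temperature-softened KL divergence between teacher and student outputs). It is not the paper's framework, whose details are not given on this page; the function names, the temperature value, and the use of raw logits are assumptions made for illustration only.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum_exp for x in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper: a generic distillation objective, assumed here
    purely for illustration (not the method proposed in the paper).
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (e.g. open-air model)
    log_q = log_softmax(student_logits, temperature)  # student (e.g. underwater model)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.0, 1.0, 0.0]
    print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's softened distribution exactly and grows as the two diverge; the T^2 factor is the usual convention for keeping gradient magnitudes comparable across temperatures.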

Findings

01. WebUOT-1M surpasses previous datasets in scale and diversity.

02. The proposed distillation framework improves underwater tracking accuracy.

03. Evaluation on 30 trackers demonstrates WebUOT-1M's effectiveness as a benchmark.
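
This page does not say which metrics were used to evaluate the trackers. As a hedged illustration of how bounding-box trackers are commonly scored, the sketch below computes intersection-over-union (IoU) between predicted and ground-truth boxes and the fraction of frames whose overlap exceeds a threshold. The `(x, y, w, h)` box format, the function names, and the 0.5 threshold are assumptions, not details taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_rate(predictions, ground_truth, threshold=0.5):
    """Fraction of frames whose IoU exceeds `threshold` (assumed value)."""
    overlaps = [iou(p, g) for p, g in zip(predictions, ground_truth)]
    return sum(o > threshold for o in overlaps) / len(overlaps)


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)]
    pred = [(0, 0, 10, 10), (5, 0, 10, 10), (20, 20, 5, 5)]
    print(success_rate(pred, gt))
```

Sweeping the threshold from 0 to 1 and plotting the resulting success rates is a common way to summarize a tracker's overlap accuracy across a whole benchmark.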

Abstract

Underwater object tracking (UOT) is a foundational task for identifying and tracing submerged entities in underwater video sequences. However, current UOT datasets suffer from limitations in scale, diversity of target categories and scenarios covered, hindering the training and evaluation of modern tracking algorithms. To bridge this gap, we take the first step and introduce WebUOT-1M, i.e., the largest public UOT benchmark to date, sourced from complex and realistic underwater environments. It comprises 1.1 million frames across 1,500 video clips filtered from 408 target categories, largely surpassing previous UOT datasets, e.g., UVOT400. Through meticulous manual annotation and verification, we provide high-quality bounding boxes for underwater targets. Additionally, WebUOT-1M includes language prompts for video sequences, expanding its application areas, e.g., underwater vision-language…

Code & Models

Repositories

983632847/awesome-multimodal-object-tracking
PyTorch · Official

Datasets

Voxel51/WebUOT-238-Test
dataset · 386 downloads

Videos

WebUOT-1M: Advancing Deep Underwater Object Tracking with A Million-Scale Benchmark · SlidesLive

Taxonomy

Topics: Underwater Acoustics Research · Target Tracking and Data Fusion in Sensor Networks · Underwater Vehicles and Communication Systems

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections