UETrack: A Unified and Efficient Framework for Single Object Tracking
Ben Kang, Jie Zhao, Xin Chen, Wanting Geng, Bin Zhang, Lu Zhang, Dong Wang, Huchuan Lu

TL;DR
UETrack is an efficient single object tracking framework that handles multiple modalities (RGB, Depth, Thermal, Event, and Language). It combines a Token-Pooling-based Mixture-of-Experts mechanism with a Target-aware Adaptive Distillation strategy, and reports high accuracy and speed on various benchmarks.
Contribution
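The paper introduces a unified framework that efficiently supports multi-modal inputs. Its two key components are a Token-Pooling-based Mixture-of-Experts mechanism, which enhances modeling capacity through feature aggregation and expert specialization, and a Target-aware Adaptive Distillation strategy, which performs distillation selectively.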
Findings
Achieves 69.2% AUC on the LaSOT benchmark.
Runs at 163/56/60 FPS on GPU/CPU/AGX hardware, respectively.
Outperforms previous methods in the speed-accuracy trade-off.
Abstract
With growing real-world demands, efficient tracking has received increasing attention. However, most existing methods are limited to RGB inputs and struggle in multi-modal scenarios. Moreover, current multi-modal tracking approaches typically use complex designs, making them too heavy and slow for resource-constrained deployment. To tackle these limitations, we propose UETrack, an efficient framework for single object tracking. UETrack demonstrates high practicality and versatility, efficiently handling multiple modalities including RGB, Depth, Thermal, Event, and Language, and addresses the gap in efficient multi-modal tracking. It introduces two key components: a Token-Pooling-based Mixture-of-Experts mechanism that enhances modeling capacity through feature aggregation and expert specialization, and a Target-aware Adaptive Distillation strategy that selectively performs distillation…
Taxonomy
Topics: Video Surveillance and Tracking Methods · Gaze Tracking and Assistive Technology · Human Pose and Action Recognition
