DARTer: Dynamic Adaptive Representation Tracker for Nighttime UAV Tracking

Xuzhao Li; Xuchen Li; Shiyu Hu

arXiv:2505.00752 · cs.CV · May 19, 2025

Open Access

TL;DR

DARTer is an end-to-end nighttime UAV tracking framework that adaptively fuses multi-perspective features and activates transformer layers for improved robustness and efficiency in challenging illumination and viewpoint conditions.

Contribution

The paper introduces DARTer, a novel tracker that employs dynamic feature fusion and adaptive transformer activation, reducing computational costs while enhancing tracking performance in nighttime UAV scenarios.

Findings

01. Outperforms state-of-the-art trackers on multiple benchmarks.

02. Effectively balances tracking accuracy and computational efficiency.

03. Eliminates complex multi-task loss functions for streamlined training.

Abstract

Nighttime UAV tracking presents significant challenges due to extreme illumination variations and viewpoint changes, which severely degrade tracking performance. Existing approaches either rely on light enhancers with high computational costs or introduce redundant domain adaptation mechanisms, failing to fully utilize the dynamic features in varying perspectives. To address these issues, we propose DARTer (Dynamic Adaptive Representation Tracker), an end-to-end tracking framework designed for nighttime UAV scenarios. DARTer leverages a Dynamic Feature Blender (DFB) to effectively fuse multi-perspective nighttime features from static and dynamic templates, enhancing representation robustness. Meanwhile, a Dynamic Feature Activator (DFA) adaptively activates Vision Transformer layers based on extracted features, significantly improving…
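The two ideas named in the abstract, blending static and dynamic template features and running a transformer layer only when a feature-dependent gate allows it, can be illustrated with a toy sketch. The abstract does not give the internals of DFB or DFA, so everything below is an assumption made for illustration: the convex blend, the sigmoid gate, and the names `blend`, `gate_score`, `run_adaptive`, `alpha`, and `threshold` are hypothetical and are not taken from the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]
Gate = Tuple[Sequence[float], float]  # (weights, bias) of a hypothetical linear gate


def blend(static_feat: Sequence[float], dynamic_feat: Sequence[float], alpha: float) -> Vector:
    """Convex combination of two template feature vectors.

    Assumption: a fixed scalar alpha stands in for whatever learned fusion DFB uses.
    """
    if len(static_feat) != len(dynamic_feat):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * s + (1.0 - alpha) * d for s, d in zip(static_feat, dynamic_feat)]


def gate_score(feat: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Sigmoid of a linear projection, read as a per-layer activation score (hypothetical)."""
    z = sum(w * x for w, x in zip(weights, feat)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def run_adaptive(
    feat: Vector,
    layers: Sequence[Layer],
    gates: Sequence[Gate],
    threshold: float = 0.5,
) -> Tuple[Vector, List[int]]:
    """Apply each layer only if its gate score reaches the threshold.

    Returns the final features and the indices of the layers that actually ran.
    """
    executed: List[int] = []
    for i, (layer, (weights, bias)) in enumerate(zip(layers, gates)):
        if gate_score(feat, weights, bias) >= threshold:
            feat = layer(feat)
            executed.append(i)
    return feat, executed


if __name__ == "__main__":
    # Toy "layers": simple element-wise functions standing in for transformer blocks.
    layers = [lambda f: [x + 1.0 for x in f], lambda f: [x * 2.0 for x in f]]
    # Toy gates: the first is biased open, the second is biased closed.
    gates = [([0.0, 0.0], 5.0), ([0.0, 0.0], -5.0)]
    fused = blend([1.0, 3.0], [3.0, 1.0], alpha=0.5)
    print(run_adaptive(fused, layers, gates))
```

In this sketch the computational saving comes from the skipped layers: a gate that stays below the threshold means its layer's cost is never paid for that input. A real tracker would operate on token tensors and learn the gate jointly with the backbone, which this toy version does not attempt.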

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · Robotics and Sensor-Based Localization

Methods: Attention Is All You Need · RoIAlign · Linear Layer · Multi-Head Attention · Dense Connections · Layer Normalization · RoIPool · Byte Pair Encoding · Label Smoothing · Adam