Track Anything Rapter(TAR)

Tharun V. Puthanveettil; Fnu Obaid ur Rahman

arXiv:2405.11655 · cs.CV · May 30, 2024

TL;DR

This paper presents TAR, a UAV tracking system that combines pre-trained models (DINO, CLIP, and SAM) with user-provided multimodal queries (text, images, and clicks) to detect, segment, and track objects of interest. The system is validated against ground truth and tested with multiple query modalities.

Contribution

Develops TAR, a novel UAV tracking system combining pre-trained models and multimodal queries for improved object detection and tracking.

Findings

1. TAR achieves stable and precise tracking on a custom drone.

2. The system effectively handles occlusions using foundational models.

3. Multi-modality support enhances tracking versatility.

Abstract

Object tracking is a fundamental task in computer vision with broad practical applications across various domains, including traffic monitoring, robotics, and autonomous vehicle tracking. In this project, we aim to develop a sophisticated aerial vehicle system known as Track Anything Rapter (TAR), designed to detect, segment, and track objects of interest based on user-provided multimodal queries, such as text, images, and clicks. TAR utilizes cutting-edge pre-trained models like DINO, CLIP, and SAM to estimate the relative pose of the queried object. The tracking problem is approached as a Visual Servoing task, enabling the UAV to consistently focus on the object through advanced motion planning and control algorithms. We showcase how the integration of these foundational models with a custom high-level control algorithm results in a highly stable and precise tracking system deployed…
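The visual servoing idea in the abstract can be pictured with a small proportional controller that steers the UAV to keep a detected object centered in the image. This is a minimal sketch under stated assumptions, not the authors' controller: the abstract describes estimating the relative pose of the object, whereas this sketch substitutes a simpler image-space error. The `BoundingBox` and `VelocityCommand` types, the `servo_step` function, the gains, the target area fraction, and the sign conventions are all hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """Axis-aligned box in pixel coordinates (hypothetical detector output)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass
class VelocityCommand:
    """Hypothetical body-frame command; field names are illustrative only."""
    forward: float   # positive moves toward the target
    vertical: float  # positive climbs
    yaw_rate: float  # positive turns right (assumed convention)


def clamp(value: float, limit: float) -> float:
    """Restrict value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def servo_step(box: BoundingBox, image_width: int, image_height: int,
               target_area_fraction: float = 0.1,
               k_yaw: float = 1.0, k_vertical: float = 0.5,
               k_forward: float = 2.0, max_cmd: float = 1.0) -> VelocityCommand:
    """One proportional control step computed from image-space errors."""
    # Center of the tracked box in pixels.
    cx = (box.x_min + box.x_max) / 2
    cy = (box.y_min + box.y_max) / 2

    # Offsets from the image center, normalized to roughly [-1, 1].
    err_x = (cx - image_width / 2) / (image_width / 2)
    err_y = (cy - image_height / 2) / (image_height / 2)

    # Box area as a fraction of the image serves as a crude distance proxy.
    box_area = (box.x_max - box.x_min) * (box.y_max - box.y_min)
    area_fraction = box_area / (image_width * image_height)
    err_area = target_area_fraction - area_fraction

    return VelocityCommand(
        forward=clamp(k_forward * err_area, max_cmd),
        # Image y grows downward, so a box above center asks for a climb.
        vertical=clamp(-k_vertical * err_y, max_cmd),
        yaw_rate=clamp(k_yaw * err_x, max_cmd),
    )
```

Calling `servo_step(BoundingBox(480, 200, 560, 280), 640, 480)` would return a command with a positive `yaw_rate`, because the box sits to the right of the image center. In a real system the box would come from the detection and segmentation models, and the gains would need tuning against the vehicle's dynamics.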

Code & Models

Repositories

tvpian/project-tar (PyTorch, official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · UAV Applications and Optimization · Advanced Neural Network Applications

Methods: Attention Is All You Need · Linear Layer · Softmax · Layer Normalization · Multi-Head Attention · Dense Connections · Residual Connection · Vision Transformer · Focus · Contrastive Language-Image Pre-training