Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance

Liting Lin; Heng Fan; Zhipeng Zhang; Yaowei Wang; Yong Xu; Haibin Ling

arXiv:2403.05231 · cs.CV · July 29, 2024 · 1 citation

TL;DR

This paper introduces LoRAT, a parameter-efficient fine-tuning method for large vision transformers in tracking, achieving faster training, higher performance, and real-time inference on limited hardware.

Contribution

We adapt LoRA for visual tracking with transformer models, addressing domain gaps and designing a new shared-independent position embedding scheme and an MLP-based head.

Findings

01 Reduced training time from 35.0 to 10.8 GPU hours

02 Improved LaSOT SUC score from 0.703 to 0.742

03 Increased inference speed from 52 to 119 FPS

Abstract

Motivated by Parameter-Efficient Fine-Tuning (PEFT) in large language models, we propose LoRAT, a method that unveils the power of large ViT models for tracking within laboratory-level resources. The essence of our work lies in adapting LoRA, a technique that fine-tunes a small subset of model parameters without adding inference latency, to the domain of visual tracking. However, unique challenges and potential domain gaps make this transfer less straightforward than it first appears. First, a transformer-based tracker constructs unshared position embeddings for the template and search images. This poses a challenge for transferring LoRA to downstream tasks, since LoRA usually requires the design to stay consistent with the pre-trained backbone. Second, the inductive bias inherent in convolutional heads diminishes the effectiveness of parameter-efficient fine-tuning in tracking models. To…
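The claim that LoRA adds no inference latency can be illustrated with a minimal sketch of the generic LoRA idea, not the LoRAT implementation. The class `LoRALinear`, its helpers, and all sizes below are hypothetical names chosen for illustration: a frozen weight W receives a trainable low-rank update scaled by alpha / rank, and after training the update can be folded into W so that inference uses a single matrix.

```python
import random


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b, scale=1.0):
    """Return a + scale * b, element-wise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


class LoRALinear:
    """Hypothetical linear layer with a frozen weight and a low-rank update."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight          # frozen W, shape (out_dim, in_dim)
        self.scale = alpha / rank     # standard LoRA scaling factor
        # A is small random, B is zero, so the update B @ A starts at zero.
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def trainable_params(self):
        """Only A and B are trained: rank * (in_dim + out_dim) values."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def forward(self, x):
        """Training-time path: y = x W^T + scale * (x A^T) B^T."""
        base = matmul(x, transpose(self.weight))
        low_rank = matmul(matmul(x, transpose(self.A)), transpose(self.B))
        return add(base, low_rank, self.scale)

    def merged_weight(self):
        """Inference-time weight: W + scale * B A, same shape as W."""
        return add(self.weight, matmul(self.B, self.A), self.scale)


layer = LoRALinear([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], rank=1, alpha=2.0)
layer.B = [[0.5], [-1.0]]  # pretend training has moved B away from zero
x = [[1.0, 0.0, -1.0]]
y_train_path = layer.forward(x)
y_merged_path = matmul(x, transpose(layer.merged_weight()))
```

Because the merged weight has the same shape as the original one, the two paths produce the same output and the deployed layer costs no more than the unadapted layer. In this toy case only 5 values are trainable, against 6 in the frozen weight; the gap widens quickly at realistic layer sizes.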

Code & Models

Repositories

litinglin/lorat (PyTorch, official)

Taxonomy

Topics: AI-based Problem Solving and Planning · Robotics and Automated Systems