Autogenic Language Embedding for Coherent Point Tracking

Zikai Song; Ying Tang; Run Luo; Lintao Ma; Junqing Yu; Yi-Ping Phoebe Chen; Wei Yang

arXiv:2407.20730 · cs.CV · July 31, 2024 · 1 citation

TL;DR

This paper presents a point tracking method that uses language embeddings, learned from the visual features themselves, to improve point correspondence and feature coherence across long video sequences. It reports better results than visual-only approaches.

Contribution

Introduces autogenic language embedding for visual feature enhancement, learning text embeddings from visual data without explicit annotations, improving long-term point tracking.

Findings

01. Significantly improves tracking accuracy on benchmark datasets.

02. Enhances visual feature consistency with minimal computational overhead.

03. Outperforms existing visual-only tracking methods.

Abstract

Point tracking is a challenging task in computer vision, aiming to establish point-wise correspondence across long video sequences. Recent advancements have primarily focused on temporal modeling techniques to improve local feature similarity, often overlooking the valuable semantic consistency inherent in tracked points. In this paper, we introduce a novel approach leveraging language embeddings to enhance the coherence of frame-wise visual features related to the same object. Our proposed method, termed autogenic language embedding for visual feature enhancement, strengthens point correspondence in long-term sequences. Unlike existing visual-language schemes, our approach learns text embeddings from visual features through a dedicated mapping network, enabling seamless adaptation to various tracking tasks without explicit text annotations. Additionally, we introduce a consistency…
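
The idea described in the abstract (map visual features to a text-like embedding, then use that embedding to make frame-wise features of the same object more coherent) can be illustrated with a small sketch. This is not the authors' implementation: the two-layer mapping network, the sigmoid gating rule, and every name below (`init_mapping_network`, `map_to_text_embedding`, `enhance_features`, `text_to_visual`) are assumptions made for illustration, and the weights are random rather than trained. The consistency component mentioned at the end of the abstract is not sketched because its description is truncated here.

```python
import numpy as np

def init_mapping_network(visual_dim, text_dim, hidden_dim=64, seed=0):
    """Random weights for a hypothetical two-layer MLP (assumed, not the paper's network)."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.02, (visual_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.02, (hidden_dim, text_dim)),
        "b2": np.zeros(text_dim),
    }

def map_to_text_embedding(point_feature, params):
    """Map one visual feature vector (C,) to a text-like embedding (text_dim,)."""
    hidden = np.maximum(point_feature @ params["w1"] + params["b1"], 0.0)  # ReLU
    return hidden @ params["w2"] + params["b2"]

def enhance_features(frame_features, text_embedding, text_to_visual):
    """Re-weight features (T, H, W, C) by similarity to the projected embedding.

    The gating rule is an assumption: locations whose features align with the
    embedding are amplified by a factor between 1 and 2.
    """
    query = text_embedding @ text_to_visual                      # (C,)
    query = query / (np.linalg.norm(query) + 1e-8)
    norms = np.linalg.norm(frame_features, axis=-1, keepdims=True)
    similarity = (frame_features / (norms + 1e-8)) @ query       # (T, H, W)
    gate = 1.0 / (1.0 + np.exp(-similarity))                     # sigmoid in (0, 1)
    return frame_features * (1.0 + gate[..., None])

# Usage with synthetic data: 4 frames, an 8x8 feature grid, 32 channels.
rng = np.random.default_rng(1)
frames = rng.normal(size=(4, 8, 8, 32))
params = init_mapping_network(visual_dim=32, text_dim=16)
text_to_visual = rng.normal(0.0, 0.02, (16, 32))

# Feature at the query point in the first frame (row 3, column 5).
text_embedding = map_to_text_embedding(frames[0, 3, 5], params)
enhanced = enhance_features(frames, text_embedding, text_to_visual)
```

The same embedding is applied to every frame, which is the sense in which a single semantic descriptor could encourage coherence over time. No text annotation is used anywhere, matching the abstract's claim that the embedding is derived from visual features alone.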

Code & Models

Repositories

skyesong38/altrack (Official)

Taxonomy

Topics: Topic Modeling