Decomposition Betters Tracking Everything Everywhere

Rui Li; Dong Liu

arXiv:2407.06531·cs.CV·July 17, 2024

TL;DR

DecoMotion is a test-time optimization method that decomposes a video into static and dynamic components, improving per-pixel motion estimation and robustness to occlusion while also separating the video's appearance into those two components.

Contribution

The paper proposes DecoMotion, a method that explicitly decomposes video content into static and dynamic volumes for long-range, occlusion-robust motion tracking.

Findings

01. Significantly improves point-tracking accuracy on the TAP-Vid benchmark.

02. Performs comparably to state-of-the-art point-tracking methods.

03. Effectively handles occlusions and non-rigid deformations.

Abstract

Recent studies on motion estimation have advocated an optimized motion representation that is globally consistent across the entire video, preferably for every pixel. This is challenging as a uniform representation may not account for the complex and diverse motion and appearance of natural videos. We address this problem and propose a new test-time optimization method, named DecoMotion, for estimating per-pixel and long-range motion. DecoMotion explicitly decomposes video content into static scenes and dynamic objects, each represented by its own quasi-3D canonical volume. DecoMotion separately coordinates the transformations between local and canonical spaces, facilitating an affine transformation for the static scene that corresponds to camera motion. For the dynamic volume, DecoMotion leverages discriminative and temporally consistent features to rectify the non-rigid…
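The static-scene idea in the abstract (an affine transformation between each frame's local space and a shared canonical space) can be illustrated with a small sketch. This is not the authors' implementation: the names `Affine`, `apply_affine`, `invert_affine`, and `track_static`, the choice of a 3x3 linear part plus a translation, and the local-to-canonical direction of each per-frame map are all assumptions made for illustration. The dynamic volume and its feature-based correction are not modeled here.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = List[List[float]]
Affine = Tuple[Mat3, Vec3]  # hypothetical container: (linear part A, translation t)


def apply_affine(affine: Affine, x: Vec3) -> Vec3:
    """Map a quasi-3D point through x -> A @ x + t."""
    A, t = affine
    return tuple(sum(A[r][c] * x[c] for c in range(3)) + t[r] for r in range(3))


def invert_affine(affine: Affine) -> Affine:
    """Return the inverse map y -> A^-1 @ (y - t), using the 3x3 adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = affine[0]
    t = affine[1]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("affine transform is not invertible")
    inv = [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]
    inv_t = tuple(-sum(inv[r][k] * t[k] for k in range(3)) for r in range(3))
    return inv, inv_t


def track_static(x_i: Vec3, frame_i: Affine, frame_j: Affine) -> Vec3:
    """Carry a static-scene point from frame i to frame j via the canonical space."""
    canonical = apply_affine(frame_i, x_i)  # local (frame i) -> canonical
    return apply_affine(invert_affine(frame_j), canonical)  # canonical -> local (frame j)
```

Under these assumptions, a correspondence between any two frames is the composition of one frame's map with the inverse of the other's, which is what makes the representation globally consistent: tracking from frame i to frame j and back returns the starting point.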

Code & Models

Repositories

qianduoduolr/decomotion (official)

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Vision and Imaging