TrajLoom: Dense Future Trajectory Generation from Video

Zewei Zhang; Jia Jun Cheng Xian; Kaiwen Liu; Ming Liang; Hang Chu; Jun Chen; and Renjie Liao

arXiv:2603.22606 · cs.CV · March 25, 2026

Open Access · 1 Model · 1 Dataset

TL;DR

TrajLoom is a framework for dense future trajectory prediction from video. It extends the prediction horizon from 24 to 81 frames, improves motion realism and stability across datasets, and supports downstream video generation and editing.

Contribution

The paper presents a method that combines Grid-Anchor Offset Encoding, a variational autoencoder (TrajLoom-VAE), and flow matching in latent space (TrajLoom-Flow), along with a unified benchmark (TrajLoomBench) for future trajectory prediction.

Findings

01 Extends prediction horizon from 24 to 81 frames.

02 Improves motion realism and stability across datasets.

03 Supports downstream video generation and editing.

Abstract

Predicting future motion is crucial in video understanding and controllable video generation. Dense point trajectories are a compact, expressive motion representation, but modeling their future evolution from observed video remains challenging. We propose a framework that predicts future trajectories and visibility from past trajectories and video context. Our method has three components: (1) Grid-Anchor Offset Encoding, which reduces location-dependent bias by representing each point as an offset from its pixel-center anchor; (2) TrajLoom-VAE, which learns a compact spatiotemporal latent space for dense trajectories with masked reconstruction and a spatiotemporal consistency regularizer; and (3) TrajLoom-Flow, which generates future trajectories in latent space via flow matching, with boundary cues and on-policy K-step fine-tuning for stable sampling. We also introduce TrajLoomBench, a…
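The Grid-Anchor Offset Encoding idea from the abstract can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`grid_anchors`, `encode_offsets`, `decode_offsets`), the `(T, H, W, 2)` array layout, and the `+0.5` pixel-center convention are all hypothetical, and any normalization the paper applies is not shown.

```python
import numpy as np

def grid_anchors(height, width):
    """Pixel-center anchors of shape (height, width, 2), in (x, y) order.

    Assumption: pixel (row i, col j) has its center at (j + 0.5, i + 0.5).
    """
    ys, xs = np.meshgrid(
        np.arange(height) + 0.5, np.arange(width) + 0.5, indexing="ij"
    )
    return np.stack([xs, ys], axis=-1)

def encode_offsets(tracks, anchors):
    """Turn absolute positions (T, H, W, 2) into offsets from each point's anchor."""
    return tracks - anchors[None]

def decode_offsets(offsets, anchors):
    """Invert encode_offsets: recover absolute positions from offsets."""
    return offsets + anchors[None]

# Toy data: a 2x3 grid of points, every point moving +1 pixel in x per frame.
anchors = grid_anchors(2, 3)
steps = np.arange(4, dtype=np.float64)[:, None, None, None] * np.array([1.0, 0.0])
tracks = anchors[None] + steps            # shape (4, 2, 3, 2), absolute coordinates
offsets = encode_offsets(tracks, anchors)  # same shape, location-independent values
```

In this toy case every point shares the same motion, so the encoded offsets are identical across the grid even though the absolute coordinates differ by location. That is the sense in which an offset representation can reduce location-dependent bias.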

Code & Models

Models

🤗 zeweizhang/TrajLoom · model

Datasets

zeweizhang/TrajLoomDatasets · dataset · 42 dl

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Human Motion and Animation · Human Pose and Action Recognition