Goodbye Drift: Anchored Tree Sampling for Long-Horizon Video-to-Video Generation
Matthew Bendel, Stephen W. Bailey, Mithilesh Vaidya, Sumukh Badam, Xingzhe He

TL;DR
This paper introduces Anchored Tree Sampling (ATS), a novel inference-time scheduler for long-horizon video generation that reduces drift and improves quality without retraining, demonstrated across multiple modalities and long durations.
Contribution
ATS is a training-free, inference-time method that organizes sparse-to-dense anchor imputation as a tree, significantly reducing drift and enhancing long-horizon video generation quality.
Findings
ATS outperforms autoregressive baselines in quality and drift prevention.
ATS achieves stable 40-minute video generation across five modalities.
ATS reduces the critical path from K steps to L+1 steps, improving efficiency.
Abstract
Long-horizon video generation suffers from two intertwined issues. The first is drift, where video quality degrades over time. The second is continuity, which manifests as failures of object permanence or improper rendering of transient content (e.g., an object that appears in non-consecutive frames changing color or style). Recent work has focused on autoregressive distillation techniques that attack both problems simultaneously. We instead focus on drift directly and introduce Anchored Tree Sampling (ATS): a training-free inference-time scheduler that replaces left-to-right rollout with sparse-to-dense, anchor-bounded imputation organized as a tree. A root call produces sparse anchors over the full horizon, recursive refinement generates intermediate anchors, and final leaf spans are synthesized between neighboring anchors. This reduces the critical path from…
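The sparse-to-dense ordering described in the abstract can be illustrated with a small scheduling sketch. This is not the authors' implementation: the function name `anchor_tree_schedule`, the parameters `branching` and `max_leaf_span`, and the evenly spaced anchor placement are all assumptions made for illustration, and no video model is called. The sketch only computes which frame indices would serve as anchors at each tree level and which anchor pairs would bound the final leaf spans.

```python
from typing import List, Tuple


def anchor_tree_schedule(
    num_frames: int, branching: int, max_leaf_span: int
) -> Tuple[List[List[int]], List[Tuple[int, int]]]:
    """Hypothetical sparse-to-dense schedule (illustrative, not the paper's code).

    Returns (levels, leaf_spans):
      levels[0]   -- sparse root anchors spanning the full horizon
      levels[i>0] -- new anchors inserted between existing neighbours
      leaf_spans  -- (left, right) anchor pairs whose interior frames
                     would be synthesized last
    """
    if num_frames < 2 or branching < 2 or max_leaf_span < 1:
        raise ValueError("need num_frames >= 2, branching >= 2, max_leaf_span >= 1")

    last = num_frames - 1
    # Root call: evenly spaced anchors that include both endpoints (assumed layout).
    anchors = sorted({i * last // branching for i in range(branching + 1)})
    levels = [anchors]

    # Recursive refinement: subdivide every gap that is still too wide.
    while max(b - a for a, b in zip(anchors, anchors[1:])) > max_leaf_span:
        new = set()
        for a, b in zip(anchors, anchors[1:]):
            if b - a > max_leaf_span:
                for i in range(1, branching):
                    idx = a + i * (b - a) // branching
                    if a < idx < b:
                        new.add(idx)
        levels.append(sorted(new))
        anchors = sorted(set(anchors) | new)

    # Leaf spans: neighbouring anchors with at least one frame between them.
    leaf_spans = [(a, b) for a, b in zip(anchors, anchors[1:]) if b - a > 1]
    return levels, leaf_spans


levels, leaf_spans = anchor_tree_schedule(num_frames=65, branching=2, max_leaf_span=4)
print(len(levels), "anchor levels,", len(leaf_spans), "leaf spans")
```

Under the further assumption that all calls within one level are independent and can run in parallel, the number of sequential stages in this sketch is the number of anchor levels plus one for the leaves, whereas a left-to-right rollout over the same leaf spans would need one sequential stage per span. The summary does not define K or L, so these counts are illustrative only and should not be read as the paper's own quantities.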
