TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation

Zonglin Lyu; Chen Chen

arXiv:2507.04984·cs.CV·July 8, 2025

TL;DR

TLB-VFI introduces an efficient, temporal-aware diffusion model for video frame interpolation that significantly improves quality, reduces parameters, and speeds up inference by leveraging novel temporal encoding techniques.

Contribution

The paper proposes TLB-VFI, a novel temporal-aware latent diffusion model that enhances video frame interpolation efficiency and performance with fewer parameters and training data.

Findings

01 Achieves 20% better FID on challenging datasets.

02 Uses 3x fewer parameters and runs inference 2.3x faster.

03 Requires 9000x less training data with optical flow guidance.

Abstract

Video Frame Interpolation (VFI) aims to predict the intermediate frame $I_{n}$ (we use $n$ to denote time in videos to avoid notation overload with the timestep $t$ in diffusion models) based on two consecutive neighboring frames $I_{0}$ and $I_{1}$. Recent approaches apply diffusion models (both image-based and video-based) in this task and achieve strong performance. However, image-based diffusion models are unable to extract temporal information and are relatively inefficient compared to non-diffusion methods. Video-based diffusion models can extract temporal information, but they are too large in terms of training scale, model size, and inference time. To mitigate the above issues, we propose Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation (TLB-VFI), an efficient video-based diffusion model. By extracting rich temporal information from video inputs through our…
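For readers unfamiliar with the term in the title, a Brownian bridge is a stochastic process pinned at both endpoints: at time $t \in [0, T]$ its mean is the linear blend of the two endpoints and its variance is $\sigma^2 t (T - t) / T$, which vanishes at $t = 0$ and $t = T$. The sketch below samples that textbook marginal for a pair of vectors. It is an illustration only, not the paper's latent-space formulation or noise schedule, which the excerpt above does not spell out; the function name `brownian_bridge_marginal` and its parameters are hypothetical.

```python
import math
import random


def brownian_bridge_marginal(start, end, t, T=1.0, sigma=1.0, rng=None):
    """Sample x_t from a textbook Brownian bridge pinned at start (t=0) and end (t=T).

    Hypothetical helper for illustration; not taken from the TLB-VFI code.
    """
    if len(start) != len(end):
        raise ValueError("start and end must have the same length")
    if not 0.0 <= t <= T:
        raise ValueError("t must lie in [0, T]")
    rng = rng or random.Random()
    m = t / T                                  # interpolation weight toward `end`
    std = sigma * math.sqrt(t * (T - t) / T)   # zero at both endpoints
    return [
        (1.0 - m) * a + m * b + std * rng.gauss(0.0, 1.0)
        for a, b in zip(start, end)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    x_start, x_end = [0.0, 1.0], [2.0, -1.0]
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(t, brownian_bridge_marginal(x_start, x_end, t, rng=rng))
```

At `t = 0` and `t = T` the sample equals the corresponding endpoint exactly, and the noise is largest at the midpoint. That pinned-at-both-ends behaviour is what distinguishes a bridge from a standard diffusion forward process, which ends in pure Gaussian noise.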

Code & Models

Models

🤗 ucfzl/TLBVFI (model, 25 downloads, 3 likes)
