DDLP: Unsupervised Object-Centric Video Prediction with Deep Dynamic Latent Particles

Tal Daniel; Aviv Tamar

arXiv:2306.05957 · cs.CV · February 9, 2024 · 1 citation

TL;DR

DDLP introduces an efficient, interpretable object-centric video prediction method using dynamic latent particles, enabling state-of-the-art results and flexible "what-if" scenario generation in videos.

Contribution

The paper presents a new object-centric video prediction algorithm, DDLP, built on the deep latent particle (DLP) representation, which is more efficient and interpretable than slot- or patch-based representations.

Findings

1. Achieved state-of-the-art object-centric video prediction results.
2. Enabled flexible "what-if" scenario generation.
3. Demonstrated efficient diffusion-based unconditional video generation.

Abstract

We propose a new object-centric video prediction algorithm based on the deep latent particle (DLP) representation. In comparison to existing slot- or patch-based representations, DLPs model the scene using a set of keypoints with learned parameters for properties such as position and size, and are both efficient and interpretable. Our method, deep dynamic latent particles (DDLP), yields state-of-the-art object-centric video prediction results on several challenging datasets. The interpretable nature of DDLP allows us to perform "what-if" generation, i.e., to predict the consequence of changing properties of objects in the initial frames, and DLP's compact structure enables efficient diffusion-based unconditional video generation. Videos, code and pre-trained models are available: https://taldatech.github.io/ddlp-web
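To make the particle idea concrete, here is a minimal sketch of a "what-if" edit on a particle set. Everything in it is an illustrative assumption rather than the DDLP code: the `Particle` class, its field names, the normalized coordinate range, and the `what_if_move` helper are hypothetical, and the learned encoder, dynamics model, and decoder are not shown. It only illustrates why a representation with explicit position and size attributes is easy to edit before prediction.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Particle:
    """Hypothetical stand-in for one latent particle (not a type from the DDLP code)."""
    position: Tuple[float, float]  # (x, y) keypoint location; assumed normalized to [-1, 1]
    scale: Tuple[float, float]     # (width, height) of the region the particle covers
    features: Tuple[float, ...]    # appearance latent; the length here is arbitrary


def what_if_move(particles: List[Particle], index: int,
                 dx: float, dy: float) -> List[Particle]:
    """Return a copy of the particle set with one particle shifted by (dx, dy).

    The input list and its particles are left unchanged, so the original
    and the edited initial frame can both be passed to a predictor.
    """
    x, y = particles[index].position
    edited = list(particles)
    edited[index] = replace(particles[index], position=(x + dx, y + dy))
    return edited


# Two made-up particles standing in for the encoded initial frame.
frame0 = [
    Particle(position=(-0.5, 0.0), scale=(0.2, 0.2), features=(0.1, 0.9)),
    Particle(position=(0.25, 0.5), scale=(0.1, 0.3), features=(0.7, 0.3)),
]

# "What if the first object started further to the right?"
edited = what_if_move(frame0, index=0, dx=0.25, dy=0.0)
print(edited[0].position)
```

In this sketch, the edited particle set would then replace the original one as the input to a learned dynamics model, and the predicted rollouts of the two could be compared.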

Code & Models

Repositories

taldatech/ddlp (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Cancer-related molecular mechanisms research