TransVDM: Motion-Constrained Video Diffusion Model for Transparent Video Synthesis

Menghao Li; Zhenghao Zhang; Junchao Liao; Long Qin; Weizhi Wang

arXiv:2502.19454·cs.GR·March 4, 2025

TL;DR

TransVDM is a novel diffusion-based model that generates transparent videos by integrating a specialized autoencoder, a motion constraint module, and a large transparent video dataset, advancing the capabilities of video synthesis.

Contribution

We introduce TransVDM, the first diffusion model tailored for transparent video generation, combining a new autoencoder, motion constraints, and a large transparent video dataset.

Findings

01 TransVDM effectively generates high-quality transparent videos.

02 The model reduces artifacts in transparent regions.

03 In the reported experiments, TransVDM outperforms existing methods on benchmarks.

Abstract

Recent developments in Video Diffusion Models (VDMs) have demonstrated remarkable capability to generate high-quality video content. Nonetheless, the potential of VDMs for creating transparent videos remains largely uncharted. In this paper, we introduce TransVDM, the first diffusion-based model specifically designed for transparent video generation. TransVDM integrates a Transparent Variational Autoencoder (TVAE) and a pretrained UNet-based VDM, along with a novel Alpha Motion Constraint Module (AMCM). The TVAE captures the alpha channel transparency of video frames and encodes it into the latent space of the VDMs, facilitating a seamless transition to transparent video diffusion models. To improve the detection of transparent areas, the AMCM integrates motion constraints from the foreground within the VDM, helping to reduce undesirable artifacts. Moreover, we curate a dataset…
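To make the notion of a per-frame alpha channel concrete, the following is a minimal sketch of standard "over" compositing, which is how a transparent (RGBA) video is typically placed on a background. It is not the TVAE or any part of TransVDM. The names `composite_over` and `composite_video` are hypothetical, and the sketch assumes straight (non-premultiplied) alpha with all channel values in [0, 1].

```python
from typing import List, Tuple

# Assumed representation for illustration only: channel values are floats in [0, 1].
RGBA = Tuple[float, float, float, float]
RGB = Tuple[float, float, float]
RGBAFrame = List[List[RGBA]]  # rows of pixels
RGBFrame = List[List[RGB]]


def composite_over(fg: RGBA, bg: RGB) -> RGB:
    """Blend one straight-alpha foreground pixel over an opaque background pixel."""
    r, g, b, a = fg
    # out = alpha * foreground + (1 - alpha) * background, per color channel
    blended = [a * c + (1.0 - a) * k for c, k in zip((r, g, b), bg)]
    return (blended[0], blended[1], blended[2])


def composite_video(frames: List[RGBAFrame], bg: RGB) -> List[RGBFrame]:
    """Composite every pixel of every RGBA frame over a constant background color."""
    return [
        [[composite_over(px, bg) for px in row] for row in frame]
        for frame in frames
    ]


if __name__ == "__main__":
    # One 1x2 frame: a half-transparent red pixel and a fully transparent pixel.
    video = [[[(1.0, 0.0, 0.0, 0.5), (0.0, 1.0, 0.0, 0.0)]]]
    print(composite_video(video, bg=(0.0, 0.0, 1.0)))
```

An alpha of 0 leaves the background untouched and an alpha of 1 shows only the foreground, so errors in the alpha channel appear directly as halos or holes once the video is composited.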

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Video Coding and Compression Technologies · Image and Video Quality Assessment