Adaptive 1D Video Diffusion Autoencoder
Yao Teng, Minxuan Lin, Xian Liu, Shuai Wang, Xiao Yang, Xihui Liu

TL;DR
This paper introduces One-DVA, a transformer-based video autoencoder that encodes videos into variable-length 1D latents and reconstructs them with a diffusion decoder, overcoming the limits of fixed-rate compression and better supporting generative modeling.
Contribution
The paper presents an adaptive 1D video autoencoder that pairs a transformer-based encoder with a diffusion decoder, enabling variable-length compression and better support for generative modeling on the latents.
Findings
Achieves comparable reconstruction quality to 3D-CNN VAEs at similar compression ratios.
Supports adaptive compression, which allows higher compression ratios.
Regularizes latent distribution for better generative modeling.
Abstract
Recent video generation models largely rely on video autoencoders that compress pixel-space videos into latent representations. However, existing video autoencoders suffer from three major limitations: (1) fixed-rate compression that wastes tokens on simple videos, (2) inflexible CNN architectures that prevent variable-length latent modeling, and (3) deterministic decoders that struggle to recover appropriate details from compressed latents. To address these issues, we propose One-Dimensional Diffusion Video Autoencoder (One-DVA), a transformer-based framework for adaptive 1D encoding and diffusion-based decoding. The encoder employs query-based vision transformers to extract spatiotemporal features and produce latent representations, while a variable-length dropout mechanism dynamically adjusts the latent length. The decoder is a pixel-space diffusion transformer that reconstructs…
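The abstract mentions a variable-length dropout mechanism but does not spell out how it works. The following is a minimal sketch of one plausible reading, assuming the latent is an ordered 1D token sequence and that dropout keeps a randomly sized prefix of it. The function name `variable_length_dropout`, the `min_keep` parameter, and the prefix-truncation rule are all illustrative assumptions, not the authors' implementation.

```python
import random


def variable_length_dropout(latents, min_keep, rng):
    """Keep a random-length prefix of a 1D latent sequence.

    Hypothetical helper: the prefix-truncation rule is an assumption,
    not a detail confirmed by the abstract.

    latents:  list of token vectors (each a list of floats)
    min_keep: smallest number of tokens that must survive
    rng:      a random.Random instance, passed in for reproducibility
    Returns (kept, mask), where mask[i] is True if token i was kept.
    """
    num_tokens = len(latents)
    if not 1 <= min_keep <= num_tokens:
        raise ValueError("min_keep must be in [1, num_tokens]")
    # randint is inclusive on both ends, so the full length can be kept.
    k = rng.randint(min_keep, num_tokens)
    mask = [i < k for i in range(num_tokens)]
    return latents[:k], mask


rng = random.Random(0)
# 16 latent tokens of dimension 8, filled with Gaussian noise for illustration.
latents = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(16)]
kept, mask = variable_length_dropout(latents, min_keep=4, rng=rng)
print(len(kept), sum(mask))
```

Under this reading, a decoder trained on such truncated sequences would see latents of many lengths, which is one way a single model could serve several compression ratios.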
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Advanced Data Compression Techniques
