Hierarchical Vector-Quantized Latents for Perceptual Low-Resolution Video Compression

Manikanta Kotthapalli; Banafsheh Rekabdar

arXiv:2512.24547 · cs.CV · January 1, 2026

Open Access

TL;DR

This paper introduces a hierarchical vector-quantized autoencoder for low-resolution video compression, enabling efficient storage and transmission with high perceptual quality suitable for edge devices.

Contribution

The paper extends VQ-VAE-2 to the spatiotemporal setting with a two-level hierarchical latent structure, targeting low-resolution video compression on resource-constrained devices.

Findings

1. Achieves 25.96 dB PSNR and 0.8375 SSIM on UCF101
2. Improves over baseline by 1.41 dB PSNR
3. Lightweight model with 18.5M parameters
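
To put the PSNR figures in context, the sketch below computes PSNR from mean squared error using the standard definition and converts a dB gain into the corresponding change in MSE. It assumes 8-bit pixel values (peak 255); the sample pixel values and the helper names `psnr` and `mse_ratio_for_gain` are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Return PSNR in dB between two equal-length sequences of pixel values."""
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same length")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Return the factor by which MSE is multiplied when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


# Toy pixel values (assumed for illustration only).
ref = [0, 64, 128, 255]
rec = [2, 60, 130, 250]
print(f"PSNR: {psnr(ref, rec):.2f} dB")

# A 1.41 dB gain at a fixed peak value means the MSE shrinks by this factor.
print(f"MSE ratio for +1.41 dB: {mse_ratio_for_gain(1.41):.3f}")
```

Because PSNR is logarithmic in MSE, the reported 1.41 dB improvement corresponds to roughly a 28% reduction in mean squared error, provided both models are evaluated with the same peak value.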

Abstract

The exponential growth of video traffic has placed increasing demands on bandwidth and storage infrastructure, particularly for content delivery networks (CDNs) and edge devices. While traditional video codecs like H.264 and HEVC achieve high compression ratios, they are designed primarily for pixel-domain reconstruction and lack native support for machine learning-centric latent representations, limiting their integration into deep learning pipelines. In this work, we present a Multi-Scale Vector Quantized Variational Autoencoder (MS-VQ-VAE) designed to generate compact, high-fidelity latent representations of low-resolution video, suitable for efficient storage, transmission, and client-side decoding. Our architecture extends the VQ-VAE-2 framework to a spatiotemporal setting, introducing a two-level hierarchical latent structure built with 3D residual convolutions. The model is…
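
The quantization step shared by VQ-VAE-style models can be sketched as a nearest-neighbor lookup into a learned codebook. This is a minimal toy version, not the authors' implementation: the codebook entries, latent vectors, and the function name `quantize` are assumptions for illustration, and the sketch omits the 3D residual convolutions and the two-level hierarchy described in the abstract.

```python
import math


def quantize(vectors, codebook):
    """Map each vector to the index and value of its nearest codebook entry."""
    indices = []
    for v in vectors:
        # Squared Euclidean distance from v to every codebook entry.
        distances = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        indices.append(distances.index(min(distances)))
    return indices, [codebook[i] for i in indices]


# Toy 2-D codebook and encoder outputs (assumed values, not from the paper).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
latents = [[0.9, 1.2], [-0.8, 0.7], [0.1, -0.2]]

indices, quantized = quantize(latents, codebook)
print(indices)

# Only the indices need to be stored or transmitted; with fixed-length
# coding, each index costs ceil(log2(K)) bits for a codebook of K entries.
bits_per_index = math.ceil(math.log2(len(codebook)))
print(bits_per_index)
```

In a hierarchical design, a lookup of this kind would be applied separately at each latent level, each with its own codebook; the compressed representation is then the set of index maps rather than the pixel data.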

Taxonomy

Topics: Advanced Data Compression Techniques · Video Coding and Compression Technologies · Image and Video Quality Assessment