DiV-INR: Extreme Low-Bitrate Diffusion Video Compression with INR Conditioning

Eren Çetin; Lucas Relic; Yuanyi Xue; Markus Gross; Christopher Schroers; Roberto Azevedo

arXiv:2604.08329·eess.IV·April 10, 2026

TL;DR

This paper introduces a video compression method that combines implicit neural representations (INRs) with pre-trained video diffusion models to reach high perceptual quality at extremely low bitrates (<0.05 bpp), outperforming HEVC, VVC, and previous neural codecs in perceptual quality.

Contribution

It proposes jointly optimizing the INR weights and parameter-efficient adapters for the diffusion model, so that the INR learns reliable conditioning signals while video-specific information is encoded with minimal parameter overhead.

Findings

01. Significant improvements in LPIPS, DISTS, and FID metrics at <0.05 bpp.

02. Outperforms HEVC, VVC, and previous neural codecs in perceptual quality.

03. Reveals a semantic-to-visual hierarchy in scene representation.
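
To give a sense of scale for the <0.05 bpp regime named in the findings, the short sketch below converts between a payload size and bits per pixel (total bits divided by width × height × frame count). The clip dimensions are a hypothetical example and the function names are illustrative; neither is taken from the paper.

```python
import math


def bits_per_pixel(total_bits, width, height, num_frames):
    """Bitrate in bits per pixel: payload size over the total pixel count."""
    return total_bits / (width * height * num_frames)


def max_payload_bits(target_bpp, width, height, num_frames):
    """Largest whole-bit payload that stays at or below target_bpp."""
    return math.floor(target_bpp * (width * height * num_frames))


# Hypothetical clip (an assumption for illustration): 1920x1080, 600 frames.
budget_bits = max_payload_bits(0.05, 1920, 1080, 600)
budget_megabytes = budget_bits / 8 / 1e6
achieved_bpp = bits_per_pixel(budget_bits, 1920, 1080, 600)
```

For this hypothetical clip, a 0.05 bpp ceiling leaves roughly 62.2 Mbit, or about 7.8 MB, for everything that must be transmitted for the whole sequence.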

Abstract

We present a perceptually-driven video compression framework integrating implicit neural representations (INRs) and pre-trained video diffusion models to address the extremely low bitrate regime (<0.05 bpp). Our approach exploits the complementary strengths of INRs, which provide a compact video representation, and diffusion models, which offer rich generative priors learned from large-scale datasets. The INR-based conditioning replaces traditional intra-coded keyframes with bit-efficient neural representations trained to estimate latent features and guide the diffusion process. Our joint optimization of INR weights and parameter-efficient adapters for diffusion models allows the model to learn reliable conditioning signals while encoding video-specific information with minimal parameter overhead. Our experiments on UVG, MCL-JCV, and JVET Class-B benchmarks demonstrate substantial…
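
As a rough illustration of what "an INR trained to estimate latent features" can look like, the sketch below implements a generic coordinate MLP that maps a normalized (x, y, t) position to a small latent vector. Everything here is an assumption made for illustration: the class name `TinyINR`, the sinusoidal positional encoding, the layer sizes, and the random initialization. The abstract does not specify the paper's INR architecture, and training and the diffusion conditioning step are omitted.

```python
import math
import random


def positional_encoding(coord, num_freqs):
    """Encode a scalar in [0, 1] as sin/cos pairs at frequencies 2^k * pi."""
    out = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * coord
        out.extend((math.sin(angle), math.cos(angle)))
    return out


class TinyINR:
    """Hypothetical coordinate MLP: (x, y, t) -> latent feature vector."""

    def __init__(self, num_freqs=4, hidden=16, latent_channels=4, seed=0):
        rng = random.Random(seed)
        in_dim = 3 * 2 * num_freqs  # three coordinates, sin and cos per frequency
        self.num_freqs = num_freqs
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden)]
                   for _ in range(latent_channels)]
        self.b2 = [0.0] * latent_channels

    def num_params(self):
        """Count of weights and biases, i.e. what would need to be transmitted."""
        return (len(self.w1) * len(self.w1[0]) + len(self.b1)
                + len(self.w2) * len(self.w2[0]) + len(self.b2))

    def __call__(self, x, y, t):
        feats = []
        for c in (x, y, t):
            feats.extend(positional_encoding(c, self.num_freqs))
        # Hidden layer with ReLU activation.
        hidden = [max(0.0, sum(w * f for w, f in zip(row, feats)) + b)
                  for row, b in zip(self.w1, self.b1)]
        # Linear output layer producing the latent features.
        return [sum(w * h for w, h in zip(row, hidden)) + b
                for row, b in zip(self.w2, self.b2)]


inr = TinyINR()
latent = inr(0.25, 0.5, 0.0)  # query one spatio-temporal location
param_count = inr.num_params()
```

The point of the sketch is that the "bitstream" of an INR is its parameter set: the decoder reconstructs conditioning features by querying the network at coordinates, so the number of parameters (and their precision) determines the bitrate.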
