Hierarchical B-frame Video Coding for Long Group of Pictures
Ivan Kirillov, Denis Parkhomenko, Kirill Chernyshev, Alexander Pletnev, Yibo Shi, Kai Lin, Dmitry Babin

TL;DR
This paper introduces a hierarchical B-frame video codec optimized for long groups of pictures (GOPs). Using content adaptation and rate allocation, it performs competitively with VVC and outperforms existing learned codecs in random-access scenarios.
Contribution
It presents a novel end-to-end learned video codec for random access that combines hierarchical coding, long sequence training, and content adaptation, improving performance over prior methods.
Findings
Achieves VVC-level performance in YUV-PSNR BD-Rate on some video classes.
Outperforms VVC in VMAF BD-Rate on almost all test sets.
On average, surpasses open learned LD and RA solutions in VMAF and YUV BD-Rates.
Abstract
Learned video compression methods already outperform VVC in the low-delay (LD) case, but the random-access (RA) scenario remains challenging. Most works on learned RA video compression either use HEVC as an anchor or compare against VVC under specific test conditions, using the RGB-PSNR metric instead of Y-PSNR and avoiding comprehensive evaluation. Here, we present an end-to-end learned video codec for random access that combines training on long sequences of frames, rate allocation designed for hierarchical coding, and content adaptation at inference. We show that under common test conditions (JVET-CTC), it achieves results comparable to VTM (the VVC reference software) in terms of YUV-PSNR BD-Rate on some classes of videos, and outperforms it on almost all test sets in terms of VMAF BD-Rate. On average, it surpasses open LD and RA end-to-end solutions in terms of VMAF and YUV BD-Rates.
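To make the term "hierarchical coding" concrete, the following is a minimal sketch of how a coding order for hierarchical B-frames can be derived. It assumes a dyadic GOP in which each B-frame is predicted from the nearest already-coded frame on either side; this summary does not describe the paper's actual GOP structure or reference selection. The function name `hierarchical_order` and the layer-by-layer traversal are illustrative choices, not taken from the paper.

```python
def hierarchical_order(gop_size):
    """Return a list of (frame_index, temporal_layer, refs) in coding order.

    Assumption: a dyadic hierarchy. Frames 0 and gop_size are the anchor
    frames (layer 0, refs is None). Every other frame is a B-frame whose
    refs are the (past, future) frames that bound its interval.
    """
    order = [(0, 0, None), (gop_size, 0, None)]
    intervals = [(0, gop_size)]  # intervals whose endpoints are already coded
    layer = 1
    while intervals:
        next_intervals = []
        for left, right in intervals:
            if right - left < 2:
                continue  # no frame left between the two endpoints
            mid = (left + right) // 2
            # The middle frame is coded next, using both endpoints as references.
            order.append((mid, layer, (left, right)))
            next_intervals += [(left, mid), (mid, right)]
        intervals = next_intervals
        layer += 1
    return order


if __name__ == "__main__":
    for frame, layer, refs in hierarchical_order(8):
        print(frame, layer, refs)
```

For a GOP of 8 this yields the frame order 0, 8, 4, 2, 6, 1, 3, 5, 7. Frames in higher temporal layers are referenced by fewer other frames, which is why rate allocation in hierarchical schemes typically depends on the layer. The sketch walks the hierarchy layer by layer; a depth-first traversal is an equally valid order.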
Taxonomy
Topics: Video Coding and Compression Technologies · Advanced Data Compression Techniques · Advanced Vision and Imaging
