Content-Driven Frame-Level Bit Prediction for Rate Control in Versatile Video Coding
Amritha Premkumar, Prajit T Rajendran, Vignesh V Menon, Christian Herglotz

TL;DR
This paper introduces a content-adaptive, machine learning-based frame-level bit prediction method for VVC rate control, improving efficiency and stability over traditional models by using lightweight features and Random Forest regression.
Contribution
It presents a novel framework that predicts frame bits using lightweight features and machine learning, reducing complexity and improving accuracy in rate control for VVC.
Findings
Achieves strong correlation with ground-truth bit consumption (R² of 0.93, 0.88, and 0.77 for I-, P-, and B-frames, respectively).
Reduces total encoding time by 33.3% compared to conventional two-pass rate control (2pRC).
Maintains comparable coding efficiency to two-pass rate control.
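For readers unfamiliar with the metric, the R² figures above can be read as follows, assuming R² denotes the usual coefficient of determination (the summary does not spell out the definition). The function name and the sample numbers below are illustrative only and are not taken from the paper.

```python
from typing import Sequence


def r2_score(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    1.0 means the predictions match the ground truth exactly; 0.0 means
    they are no better than always predicting the mean of `actual`.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Made-up per-frame bit counts, purely for illustration.
true_bits = [1.0, 2.0, 3.0]
perfect = r2_score(true_bits, [1.0, 2.0, 3.0])   # exact predictions
baseline = r2_score(true_bits, [2.0, 2.0, 2.0])  # always predicts the mean
```

Under this reading, the reported 0.77 for B-frames means the predictor explains roughly three quarters of the variance in B-frame bit consumption.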
Abstract
Rate control allocates bits efficiently across frames to meet a target bitrate while maintaining quality. Conventional two-pass rate control (2pRC) in Versatile Video Coding (VVC) relies on analytical rate-QP models, which often fail to capture nonlinear spatial-temporal variations, causing quality instability and high complexity due to multiple trial encodes. This paper proposes a content-adaptive framework that predicts frame-level bit consumption using lightweight features from the Video Complexity Analyzer (VCA) and quantization parameters within a Random Forest regression. On ultra-high-definition sequences encoded with VVenC, the model achieves strong correlation with ground truth, yielding R² values of 0.93, 0.88, and 0.77 for I-, P-, and B-frames, respectively. Integrated into a rate-control loop, it achieves comparable coding efficiency to 2pRC while reducing total encoding…
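One way a frame-level bit predictor could be used inside a rate-control loop is sketched below. This is an illustrative sketch under stated assumptions, not the authors' algorithm: `toy_predict_bits` is a hypothetical stand-in for the trained Random Forest, the feature names `spatial` and `temporal` are placeholders rather than actual VCA outputs, the QP range 0 to 63 is assumed, and the search assumes predicted bits never increase as QP increases.

```python
from typing import Callable, Dict

QP_MIN, QP_MAX = 0, 63  # assumed QP range for this sketch

Features = Dict[str, float]
Predictor = Callable[[Features, int], float]


def toy_predict_bits(features: Features, qp: int) -> float:
    """Hypothetical stand-in for the trained regressor.

    Bits scale with a made-up complexity score and halve every 6 QP steps.
    A real system would call the trained model here instead.
    """
    complexity = features["spatial"] + features["temporal"]
    return complexity * 1000.0 * 2.0 ** (-(qp - 22) / 6.0)


def select_qp(features: Features, target_bits: float,
              predict: Predictor = toy_predict_bits) -> int:
    """Return the smallest QP whose predicted bits fit the frame budget.

    Binary search over QP; relies on `predict` being non-increasing in QP.
    Falls back to QP_MAX when even the coarsest QP exceeds the budget.
    """
    lo, hi = QP_MIN, QP_MAX
    while lo < hi:
        mid = (lo + hi) // 2
        if predict(features, mid) <= target_bits:
            hi = mid        # budget met: try a lower QP (higher quality)
        else:
            lo = mid + 1    # over budget: need a higher QP
    return lo


# Illustrative frame with made-up feature values.
frame = {"spatial": 40.0, "temporal": 10.0}
qp_full = select_qp(frame, target_bits=50_000.0)
qp_half = select_qp(frame, target_bits=25_000.0)
```

Because each candidate QP is evaluated with a cheap prediction rather than a trial encode, a loop of this shape avoids the repeated encoding passes that the abstract identifies as the source of complexity in 2pRC.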
Taxonomy
Topics: Video Coding and Compression Technologies · Image and Video Quality Assessment · Visual Attention and Saliency Detection
