Rate distortion optimization over large scale video corpus with machine learning
Sam John, Akshay Gadde, Balu Adsumilli

TL;DR
This paper introduces a machine learning-based method for allocating bitrate across large video collections, reporting a 22% reduction in average bitrate at the same quality level.
Contribution
It clusters videos by their rate-distortion (R-D) characteristics, trains a classifier to predict each video's cluster, and uses the resulting cluster distribution to choose encoder settings across large video datasets.
Findings
Achieves a 22% bitrate reduction at the same quality level.
Uses simple, computationally cheap video complexity features for fast R-D cluster prediction.
Effective for large-scale video corpus bitrate management.
Abstract
We present an efficient codec-agnostic method for bitrate allocation over a large-scale video corpus with the goal of minimizing the average bitrate subject to constraints on average and minimum quality. Our method clusters the videos in the corpus such that videos within one cluster have similar rate-distortion (R-D) characteristics. We train a support vector machine classifier to predict the R-D cluster of a video using simple video complexity features that are computationally easy to obtain. The model allows us to classify a large sample of the corpus in order to estimate the distribution of the number of videos in each of the clusters. We use this distribution to find the optimal encoder operating point for each R-D cluster. Experiments with the AV1 encoder show that our method can achieve the same average quality over the corpus with a lower average bitrate.
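The last two steps of the abstract (estimating the cluster distribution from a classified sample, then picking one encoder operating point per cluster) can be illustrated with a minimal sketch. It is not the authors' implementation: the exhaustive search stands in for whatever optimizer the paper uses, the cluster names, bitrates (kbps), and quality scores are invented for illustration, and the function names `cluster_proportions` and `best_operating_points` are hypothetical. The predicted labels are assumed to come from an already trained classifier.

```python
from collections import Counter
from itertools import product


def cluster_proportions(predicted_labels):
    """Estimate each R-D cluster's share of the corpus from a classified sample."""
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    return {cluster: n / total for cluster, n in counts.items()}


def best_operating_points(rd_points, proportions, min_avg_quality, min_quality):
    """Pick one (bitrate, quality) point per cluster by exhaustive search.

    rd_points maps cluster -> list of (bitrate, quality) candidates
    (hypothetical values; real ones would be measured per cluster).
    Returns (average_bitrate, {cluster: (bitrate, quality)}), or None
    if no combination satisfies both quality constraints.
    """
    clusters = sorted(rd_points)
    best = None
    for choice in product(*(rd_points[c] for c in clusters)):
        # Minimum-quality constraint: every cluster must clear the floor.
        if any(quality < min_quality for _, quality in choice):
            continue
        # Average-quality constraint, weighted by estimated cluster share.
        avg_quality = sum(
            proportions.get(c, 0.0) * q for c, (_, q) in zip(clusters, choice)
        )
        if avg_quality < min_avg_quality:
            continue
        avg_bitrate = sum(
            proportions.get(c, 0.0) * r for c, (r, _) in zip(clusters, choice)
        )
        if best is None or avg_bitrate < best[0]:
            best = (avg_bitrate, dict(zip(clusters, choice)))
    return best


# Illustrative data only: labels as a classifier might predict them for a sample.
sample_labels = ["easy"] * 6 + ["hard"] * 4
proportions = cluster_proportions(sample_labels)

# Made-up candidate operating points per cluster: (bitrate in kbps, quality score).
rd_points = {
    "easy": [(500, 38.0), (800, 41.0), (1200, 43.0)],
    "hard": [(1500, 34.0), (2500, 38.0), (4000, 41.0)],
}

result = best_operating_points(
    rd_points, proportions, min_avg_quality=39.0, min_quality=36.0
)
avg_bitrate, chosen = result
print(chosen, round(avg_bitrate, 1))
```

With these invented numbers the search spends fewer bits on the "hard" cluster and compensates with higher quality on the more common "easy" cluster, which is the kind of trade-off a corpus-level constraint allows and a per-video quality target does not. Exhaustive search grows exponentially with the number of clusters, so it is only practical for a handful of clusters and candidate points.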
Taxonomy
Topics: Image and Video Quality Assessment · Video Coding and Compression Technologies · Advanced Image Processing Techniques
