The Practice of Averaging Rate-Distortion Curves over Testsets to Compare Learned Video Codecs Can Cause Misleading Conclusions
M. Akin Yilmaz, Onur Keleş, A. Murat Tekalp

TL;DR
Averaging rate-distortion curves over a test set can give a misleading picture of learned video codec performance. Reporting per-sequence metrics, in line with traditional codec evaluation practice, gives a more accurate comparison.
Contribution
This paper highlights the pitfalls of averaging RD curves in learned video codec evaluation and advocates for per-sequence metrics to improve fairness and accuracy.
Findings
Averaged RD curves can be disproportionately influenced by outlier videos.
Per-sequence metrics provide more reliable comparisons than averaged RD curves.
Traditional per-sequence evaluation practices are recommended for learned codecs.
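The outlier effect described in the findings above can be illustrated with a toy calculation. This is a minimal sketch with made-up numbers, not data from the paper: the sequence names, bitrates, and the helper functions `rate_change` and `averaged_curve` are all hypothetical. It assumes both codecs reach the same PSNR at each quality level, so the comparison reduces to a relative bitrate difference (a simplification, not a BD-rate computation), and it assumes the averaged curve is formed by averaging bitrates across sequences at each quality level.

```python
# Toy, made-up bitrates (kbps) at three quality levels for each sequence.
# Assumption: codecs A and B reach identical PSNR at each level,
# so only the bitrate differs between them.
rates_b = {
    "seq1": [100, 200, 400],
    "seq2": [120, 240, 480],
    "outlier": [2000, 4000, 8000],  # one sequence with a much higher rate range
}
rates_a = {
    "seq1": [90, 180, 360],          # A needs 10% fewer bits than B
    "seq2": [108, 216, 432],         # A needs 10% fewer bits than B
    "outlier": [2100, 4200, 8400],   # A needs 5% more bits than B
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def rate_change(r_test, r_ref):
    """Mean relative bitrate change of test vs. reference over quality levels."""
    return mean(t / r - 1.0 for t, r in zip(r_test, r_ref))


def averaged_curve(rates):
    """Average the bitrate across sequences at each quality level."""
    return [mean(level) for level in zip(*rates.values())]


# Per-sequence comparison: one number per video.
per_sequence = {name: rate_change(rates_a[name], rates_b[name]) for name in rates_b}

# Comparison on the test-set-averaged curves: one number for the whole set.
averaged = rate_change(averaged_curve(rates_a), averaged_curve(rates_b))

for name, change in per_sequence.items():
    print(f"{name}: {change:+.1%}")
print(f"averaged curve: {averaged:+.1%}")
```

With these numbers, codec A saves 10% bitrate on two of the three sequences, yet the averaged curve reports A as about 3.5% worse, because the high-bitrate outlier dominates the arithmetic mean of the rates.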
Abstract
This paper aims to demonstrate how the prevalent practice in the learned video compression community of averaging rate-distortion (RD) curves across a test video set can lead to misleading conclusions in evaluating codec performance. Through analysis of a simple case and experimental results with two recent learned video codecs, we show how averaged RD curves can mislead comparative evaluation of different codecs, particularly when videos in a dataset have varying characteristics and operating ranges. We illustrate how a single video with distinct RD characteristics from the rest of the test set can disproportionately influence the average RD curve, potentially overshadowing a codec's superior performance across most individual sequences. Using two recent learned video codecs on the UVG dataset as a case study, we demonstrate computing performance metrics, such as the BD…
Taxonomy
Topics: Video Coding and Compression Technologies
Methods: Sparse Evolutionary Training
