TL;DR
This paper introduces a data-driven method to optimize frequency-specific quantization scaling lists in VVC for neural-network-based image analysis, achieving bitrate savings of up to 8.9% over VVC with constant quantization and outperforming scaling lists based on the human visual system.
Contribution
The paper proposes a novel data-driven method for deriving scaling lists tailored to a neural network as the information sink in VVC, improving compression efficiency for task-driven image coding.
Findings
Achieves peak bitrate savings of 8.9% over VVC with constant quantization when coding the Cityscapes dataset for Mask R-CNN.
Outperforms scaling lists optimized for human visual perception.
Provides publicly available scaling lists for practical use.
Abstract
Today, visual data is often analyzed by a neural network without any human being involved, which calls for specialized codecs. For standard-compliant codec adaptations towards certain information sinks, HEVC and VVC provide the possibility of frequency-specific quantization with scaling lists. This is a well-known method for the human visual system, where scaling lists are derived from psycho-visual models. In this work, we employ scaling lists when performing VVC intra coding for neural networks as the information sink. To this end, we propose a novel data-driven method to obtain optimal scaling lists for arbitrary neural networks. Experiments with Mask R-CNN as the information sink reveal that coding the Cityscapes dataset with the proposed scaling lists results in peak bitrate savings of 8.9% over VVC with constant quantization. By that, our approach also outperforms scaling lists optimized…
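To make the idea of frequency-specific quantization concrete, the following is a minimal sketch of how a scaling list can modulate the quantizer step size per transform coefficient. It is not the paper's method and not the integer arithmetic of a real VVC encoder. It assumes the common approximation that the step size doubles every 6 QP steps and that a scaling-list entry of 16 is neutral. The function names, the 4x4 block, and the `shaped` list are hypothetical and chosen only for illustration.

```python
# Simplified, floating-point sketch of quantization with a scaling list.
# Assumption: step size ~ 2^((QP - 4) / 6); an entry of 16 is neutral.

def qstep(qp):
    """Approximate base quantizer step size for a given QP."""
    return 2.0 ** ((qp - 4) / 6.0)

def quantize(coeffs, scaling_list, qp):
    """Quantize a 2D block of transform coefficients.

    Each coefficient is divided by the base step scaled by m / 16,
    so entries above 16 quantize that frequency more coarsely.
    """
    base = qstep(qp)
    return [
        [round(c / (base * m / 16.0)) for c, m in zip(c_row, m_row)]
        for c_row, m_row in zip(coeffs, scaling_list)
    ]

def dequantize(levels, scaling_list, qp):
    """Reconstruct coefficients from quantized levels."""
    base = qstep(qp)
    return [
        [lvl * base * m / 16.0 for lvl, m in zip(l_row, m_row)]
        for l_row, m_row in zip(levels, scaling_list)
    ]

# Hypothetical 4x4 block of transform coefficients (top-left = lowest frequency).
coeffs = [
    [120, 40, 12, 4],
    [36, 20, 8, 3],
    [10, 6, 4, 2],
    [5, 3, 2, 8],
]

# Flat list: constant quantization across all frequencies.
flat = [[16] * 4 for _ in range(4)]

# Hypothetical shaped list: coarser steps towards higher frequencies.
shaped = [[16 + 8 * (u + v) for u in range(4)] for v in range(4)]

levels_flat = quantize(coeffs, flat, qp=4)
levels_shaped = quantize(coeffs, shaped, qp=4)
```

With the flat list every frequency shares the same step size, which corresponds to constant quantization. With the shaped list the high-frequency coefficients are quantized more coarsely. A data-driven method such as the one summarized above would choose the entries according to what the analysis network needs, not according to a fixed pattern like this one.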
Taxonomy
Methods: Softmax · Convolution · Region Proposal Network · RoIAlign · Mask R-CNN
