A Preprocessing Framework for Video Machine Vision under Compression
Fei Zhao, Mengxi Guo, Shijie Zhao, Junlin Li, Li Zhang, Xiaodong Xie

TL;DR
This paper introduces a neural preprocessing framework with a differentiable virtual codec to optimize video compression specifically for machine vision tasks, improving rate-accuracy performance and reducing bitrate by over 15%.
Contribution
It presents a novel preprocessing method with a differentiable virtual codec tailored for machine vision, enhancing compression efficiency without altering standard codecs.
Findings
Achieves over 15% bitrate savings compared to using the standard codec without the preprocessor.
Boosts rate-accuracy performance for machine vision tasks.
Applicable to real-world scenarios with standard codecs.
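The summary above does not say how the bitrate saving was measured. One common convention, assumed here, is to compare the bitrate each pipeline needs to reach the same task accuracy. The sketch below illustrates that arithmetic with linear interpolation between measured points. The function names and every number in the example are hypothetical and are not results from the paper.

```python
def bitrate_at_accuracy(curve, target):
    """Linearly interpolate the bitrate needed to reach `target` accuracy.

    `curve` is a list of (bitrate_kbps, accuracy) pairs sorted by bitrate,
    with accuracy assumed to be non-decreasing along the list.
    """
    for (r0, a0), (r1, a1) in zip(curve, curve[1:]):
        if a0 <= target <= a1:
            if a1 == a0:  # flat segment: the lower bitrate already suffices
                return float(r0)
            t = (target - a0) / (a1 - a0)
            return r0 + t * (r1 - r0)
    raise ValueError("target accuracy outside the measured range")


def bitrate_saving(anchor_curve, test_curve, target):
    """Percent bitrate saved by `test_curve` over `anchor_curve` at equal accuracy."""
    r_anchor = bitrate_at_accuracy(anchor_curve, target)
    r_test = bitrate_at_accuracy(test_curve, target)
    return 100.0 * (r_anchor - r_test) / r_anchor


# Made-up illustrative points, NOT measurements from the paper.
codec_only = [(500, 0.60), (1000, 0.70), (2000, 0.76)]
with_preprocessor = [(400, 0.60), (800, 0.70), (1600, 0.76)]
saving = bitrate_saving(codec_only, with_preprocessor, target=0.70)
```

With these toy curves the codec-only pipeline needs 1000 kbps and the preprocessed pipeline 800 kbps to reach 0.70 accuracy, so `saving` is 20.0. Published comparisons often average such savings over a range of operating points instead of reporting a single target.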
Abstract
There has been a growing trend in compressing and transmitting videos from terminals for machine vision tasks. Nevertheless, most video coding optimization methods focus on minimizing distortion according to human perceptual metrics, overlooking the heightened demands posed by machine vision systems. In this paper, we propose a video preprocessing framework tailored for machine vision tasks to address this challenge. The proposed method incorporates a neural preprocessor that retains crucial information for subsequent tasks, thereby boosting rate-accuracy performance. We further introduce a differentiable virtual codec to provide constraints on rate and distortion during the training stage. At test time, we directly apply widely used standard codecs, so our solution can be easily applied to real-world scenarios. We conducted extensive experiments evaluating our…
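The abstract is truncated and does not spell out the training objective, so the following is only a minimal sketch of how a differentiable codec stand-in could supply rate and distortion terms alongside a task loss. Everything here is an assumption for illustration: the additive-uniform-noise quantization proxy, the logarithmic rate proxy, the weights `lambda_rate` and `lambda_dist`, and all function names are hypothetical and are not the authors' virtual codec. It is written in plain Python for readability; in practice the same operations would live in an autograd framework.

```python
import math
import random


def noisy_quantize(x, step, rng):
    """Training-time stand-in for quantization (assumed, not from the paper):
    add uniform noise in [-step/2, step/2] instead of rounding, which keeps
    the operation smooth with respect to the input."""
    return [v + rng.uniform(-step / 2, step / 2) for v in x]


def rate_proxy(x, step):
    """Crude smooth rate stand-in: mean of log2(1 + |x| / step) per sample."""
    return sum(math.log2(1.0 + abs(v) / step) for v in x) / len(x)


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def training_loss(original, preprocessed, task_loss_fn,
                  step=0.1, lambda_rate=0.01, lambda_dist=1.0, rng=None):
    """Combine task, rate, and distortion terms into one scalar.

    `task_loss_fn` is a hypothetical callable that scores the decoded signal
    with the downstream vision model; it is supplied by the caller.
    """
    rng = rng or random.Random(0)
    decoded = noisy_quantize(preprocessed, step, rng)  # virtual "encode + decode"
    rate = rate_proxy(preprocessed, step)              # rate constraint
    dist = mse(original, decoded)                      # distortion constraint
    total = task_loss_fn(decoded) + lambda_rate * rate + lambda_dist * dist
    return total, rate, dist
```

The stand-in is used only to compute the training loss. At test time it would be replaced by a real standard codec, which matches the train and test split the abstract describes.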
Taxonomy
Topics: Video Coding and Compression Technologies · Advanced Data Compression Techniques · Image and Video Quality Assessment
