Recent Standard Development Activities on Video Coding for Machines
Wen Gao, Shan Liu, Xiaozhong Xu, Manouchehr Rafie, Yuan Zhang, Igor Curcio

TL;DR
This paper reviews recent activities in developing video coding standards optimized for machine vision applications, highlighting use cases, evaluation frameworks, and technological proposals within MPEG's VCM group.
Contribution
It provides a comprehensive overview of the MPEG VCM group's recent standardization efforts, including use cases, evaluation methods, and proposed technological solutions.
Findings
Overview of MPEG VCM group activities and plans
Evaluation framework for machine vision tasks in video coding
Discussion of recent technological proposals and responses
Abstract
In recent years, video data has come to dominate internet traffic and has become one of the major data formats. With the emerging 5G and internet of things (IoT) technologies, more and more videos are generated by edge devices, sent across networks, and consumed by machines. The volume of video consumed by machines is exceeding the volume of video consumed by humans. Machine vision tasks include object detection, segmentation, tracking, and other machine-based applications, whose requirements are quite different from those of human consumption. At the same time, because of the large volumes of video data, it is essential to compress video before transmission. Thus, efficient video coding for machines (VCM) has become an important topic in academia and industry. In July 2019, the international standardization organization MPEG created an Ad-Hoc group named VCM to study the requirements for potential…
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis
