Scalable Image Coding for Humans and Machines

Hyomin Choi; Ivan V. Bajic

arXiv:2107.08373 · eess.IV · April 13, 2022 · IEEE Trans. Image Process.

TL;DR

This paper introduces a scalable learned image codec that serves both human viewing and machine vision tasks. It reports 37%-80% bitrate savings on machine vision tasks while keeping reconstruction quality comparable to state-of-the-art codecs.

Contribution

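The paper presents an end-to-end learned image codec whose layered latent space supports scalability across tasks: the simplest task uses only a base layer, while more complicated tasks use the base layer plus one or more enhancement layers. This allows efficient compression when content is mainly analyzed by machines and only occasionally viewed by humans.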

Findings

01. 37%-80% bitrate savings on machine vision tasks

02. Comparable reconstruction quality to state-of-the-art codecs

03. Effective multi-layer scalability for diverse tasks

Abstract

At present, and increasingly so in the future, much of the captured visual content will not be seen by humans. Instead, it will be used for automated machine vision analytics and may require occasional human viewing. Examples of such applications include traffic monitoring, visual surveillance, autonomous navigation, and industrial machine vision. To address such requirements, we develop an end-to-end learned image codec whose latent space is designed to support scalability from simpler to more complicated tasks. The simplest task is assigned to a subset of the latent space (the base layer), while more complicated tasks make use of additional subsets of the latent space, i.e., both the base and enhancement layer(s). For the experiments, we establish a 2-layer and a 3-layer model, each of which offers input reconstruction for human vision, plus machine vision task(s), and compare them…
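The layering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_latent` and `channels_for_task`, the toy 8-channel latent, and the layer sizes `[2, 2, 4]` are all hypothetical, chosen only to show how a simpler task reads a subset of the latent while a more complicated task reads the base and enhancement layers together.

```python
from typing import List, Sequence


def split_latent(latent: Sequence[int], layer_sizes: Sequence[int]) -> List[List[int]]:
    """Partition latent channels into consecutive layers, base layer first.

    `layer_sizes` is a hypothetical configuration; the real split is a
    design choice of the codec and is not stated in this summary.
    """
    if sum(layer_sizes) != len(latent):
        raise ValueError("layer sizes must cover every latent channel")
    layers: List[List[int]] = []
    start = 0
    for size in layer_sizes:
        layers.append(list(latent[start:start + size]))
        start += size
    return layers


def channels_for_task(layers: Sequence[Sequence[int]], task_level: int) -> List[int]:
    """Return the channels a task needs.

    Level 0 (the simplest task) uses only the base layer; level k uses the
    base layer plus the first k enhancement layers.
    """
    if not 0 <= task_level < len(layers):
        raise ValueError("task_level must index an existing layer")
    needed: List[int] = []
    for layer in layers[: task_level + 1]:
        needed.extend(layer)
    return needed


# Toy latent with 8 channels, each represented by its channel index.
latent = list(range(8))
layers = split_latent(latent, [2, 2, 4])  # assumed 3-layer split

base_only = channels_for_task(layers, 0)     # simplest task
base_plus_one = channels_for_task(layers, 1)  # intermediate task
full_latent = channels_for_task(layers, 2)    # most complicated task
```

Under this sketch, a decoder serving only the simplest task would need to receive just the base-layer portion of the bitstream, which is where bitrate savings for machine tasks would come from.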

Code & Models

Repositories

InterDigitalInc/CompressAI-Vision/blob/main/compressai_vision/codecs/sic_sfu2022.py
pytorch
