LL-ICM: Image Compression for Low-level Machine Vision via Large Vision-Language Model

Yuan Xue; Qi Zhang; Chuanmin Jia; Shiqi Wang

arXiv:2412.03841 · cs.CV · December 6, 2024

TL;DR

This paper introduces LL-ICM, a novel image compression framework optimized for low-level machine vision tasks, integrating vision-language models to enhance robustness and generalization across multiple tasks.

Contribution

The paper pioneers a joint optimization framework for image compression tailored to low-level vision tasks, incorporating vision-language models for improved robustness and versatility.

Findings

01

Achieves 22.65% BD-rate reduction over state-of-the-art methods.

02

Enables a single codec to generalize across multiple low-level vision tasks.

03

Provides extensive objective evaluation using full and no-reference image quality assessments.
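
The BD-rate figure in the first finding is an average bitrate difference between two codecs at matched quality. The following is a minimal sketch of how such a number can be computed, not the paper's evaluation code. It assumes each codec is described by a few (bitrate, quality) points, and it swaps in piecewise-linear interpolation of log-bitrate for the cubic fit used in the standard Bjøntegaard calculation. The function names and the sample rate/quality values are hypothetical and chosen only for illustration.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation of ys over ascending, distinct xs at x."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (x - x0) / (x1 - x0) * (y1 - y0)
    raise ValueError("x lies outside the measured quality range")


def _mean_log_rate(points, lo, hi):
    """Average log-bitrate of one codec over the quality interval [lo, hi]."""
    pts = sorted(points, key=lambda p: p[1])  # sort by quality
    qs = [q for _, q in pts]
    logs = [math.log(r) for r, _ in pts]
    # Integrate exactly over the linear segments: interval ends plus inner knots.
    knots = [lo] + [q for q in qs if lo < q < hi] + [hi]
    vals = [_interp(k, qs, logs) for k in knots]
    area = sum(
        (b - a) * (va + vb) / 2
        for a, b, va, vb in zip(knots, knots[1:], vals, vals[1:])
    )
    return area / (hi - lo)


def bd_rate(anchor, test):
    """Percent bitrate change of `test` versus `anchor` at equal quality.

    Each argument is a list of (bitrate, quality) pairs. A negative result
    means `test` needs fewer bits than `anchor` for the same quality.
    """
    lo = max(min(q for _, q in anchor), min(q for _, q in test))
    hi = min(max(q for _, q in anchor), max(q for _, q in test))
    if hi <= lo:
        raise ValueError("the two curves share no quality range")
    diff = _mean_log_rate(test, lo, hi) - _mean_log_rate(anchor, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0


# Hypothetical anchor curve: (bits per pixel, quality score).
anchor = [(0.2, 30.0), (0.4, 33.0), (0.8, 36.0), (1.6, 39.0)]
# Hypothetical test codec that spends 20% fewer bits at every quality point.
test = [(0.8 * rate, quality) for rate, quality in anchor]

print(f"BD-rate: {bd_rate(anchor, test):.2f}%")
```

Under this convention, a reported 22.65% BD-rate reduction corresponds to a value of about -22.65. The quality axis can be any monotone score, so the same routine applies whether quality is measured with a full-reference or a no-reference metric.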

Abstract

Image Compression for Machines (ICM) aims to compress images for machine vision tasks rather than human viewing. Current works predominantly concentrate on high-level tasks like object detection and semantic segmentation. However, the quality of original images is often not guaranteed in the real world, which leads to even worse perceptual quality or downstream task performance after compression. Low-level (LL) machine vision models, such as image restoration models, can help improve this quality, so their compression requirements should also be considered. In this paper, we propose a pioneering ICM framework for LL machine vision tasks, namely LL-ICM. By jointly optimizing compression and LL tasks, the proposed LL-ICM not only enriches its encoding ability in generalizing to versatile LL tasks but also optimizes the processing ability of downstream LL task models, achieving mutual…

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Medical Image Segmentation Techniques