JPEG Inspired Deep Learning

Ahmed H. Salamah; Kaixiang Zheng; Yiwen Liu; En-Hui Yang

arXiv:2410.07081 · cs.CV · March 24, 2025

TL;DR

This paper introduces JPEG-DL, a deep learning framework with a trainable JPEG compression layer that improves accuracy and robustness of neural networks, challenging the belief that compression harms DL performance.

Contribution

It proposes a novel trainable JPEG compression layer using a differentiable soft quantizer, jointly trained with DNNs to enhance accuracy and robustness.

Findings

1. JPEG-DL improves accuracy by up to 20.9% on some datasets.
2. JPEG-DL enhances robustness against adversarial attacks.
3. JPEG-DL delivers significant accuracy gains across various datasets and model architectures.

Abstract

Although it is traditionally believed that lossy image compression, such as JPEG compression, has a negative impact on the performance of deep neural networks (DNNs), it is shown by recent works that well-crafted JPEG compression can actually improve the performance of deep learning (DL). Inspired by this, we propose JPEG-DL, a novel DL framework that prepends any underlying DNN architecture with a trainable JPEG compression layer. To make the quantization operation in JPEG compression trainable, a new differentiable soft quantizer is employed at the JPEG layer, and then the quantization operation and underlying DNN are jointly trained. Extensive experiments show that in comparison with the standard DL, JPEG-DL delivers significant accuracy improvements across various datasets and model architectures while enhancing robustness against adversarial attacks. Particularly, on some…
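The idea of a differentiable soft quantizer can be illustrated with a small sketch. This is not taken from the paper or its repository: it shows one common construction, in which the hard rounding step is replaced by a softmax-weighted average of the reconstruction levels, so the output varies smoothly with both the input and the step size. The names `soft_quantize`, `hard_quantize`, `alpha`, and `num_levels` are hypothetical and chosen only for this example.

```python
import math


def hard_quantize(x, step):
    """Ordinary uniform quantization: snap x to the nearest multiple of step."""
    return step * round(x / step)


def soft_quantize(x, step, alpha, num_levels=8):
    """Smooth surrogate for hard_quantize (illustrative assumption, not the paper's code).

    Returns a softmax-weighted average of the levels k * step for
    k in [-num_levels, num_levels]. Larger alpha concentrates the weight
    on the level nearest to x, so the output approaches hard quantization.
    """
    levels = [k * step for k in range(-num_levels, num_levels + 1)]
    # Levels close to x receive the largest logits.
    logits = [-alpha * (x - level) ** 2 for level in levels]
    # Subtract the maximum logit for numerical stability before exponentiating.
    max_logit = max(logits)
    weights = [math.exp(logit - max_logit) for logit in logits]
    total = sum(weights)
    return sum(w * level for w, level in zip(weights, levels)) / total


if __name__ == "__main__":
    x, step = 3.4, 2.0
    print("hard:", hard_quantize(x, step))
    for alpha in (0.1, 1.0, 50.0):
        print("soft, alpha =", alpha, "->", soft_quantize(x, step, alpha))
```

Because every operation in `soft_quantize` is smooth, an automatic differentiation framework could propagate gradients to `step`, which is what would allow a quantization step to be trained jointly with a downstream network. How the sharpness parameter is set or learned in JPEG-DL is not stated in this summary.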

Peer Reviews

Decision · ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

1. Overall, the paper is written clearly and can be easily understood.
2. A variety of experiments are conducted to verify the effectiveness.
3. The technique seems sound, although I did not check it in detail.

Weaknesses

1. I don’t think the problem studied in this paper is very important to the community. Usually, the images fed into deep learning models have already been processed by a fixed JPEG compressor, and we don’t have the chance to modify that process as this paper does, so the practical value is limited. In addition, the paper appears to be an improvement on the method in Yang 2021 (Entropy), which has not been followed by many researchers.
2. Now lots of papers have shown the vision transformer and large mod

Reviewer 02 · Rating 6 · Confidence 2

Strengths

1. To make JPEG trainable, a differentiable soft quantizer is proposed, and it works well with JPEG. Overall, making JPEG trainable is a significant contribution, because many frameworks that include JPEG could be trained with the differentiable soft quantizer.
2. A novel DL framework that prepends any underlying DNN architecture with a trainable JPEG compression layer is proposed. Experiments show it can improve accuracy significantly with only 128 parameters.
3. This paper enjoys a

Weaknesses

1. A latency comparison would be helpful; the speed of the model is also important to report.
2. Only image classification is considered. The proposed method should be validated on more tasks, such as object detection and segmentation.
3. Hyperparameters are tuned differently on different datasets.

Reviewer 03 · Rating 6 · Confidence 4

Strengths

- The approach of leveraging image compression to improve the pure performance of a model is interesting.
- The paper is well-written and easy to follow.
- A variety of network architectures and datasets are used in the experiments.

Weaknesses

Major concerns:
- A comparison to training with JPEG-based data augmentation is required to validate that the proposed method provides benefits beyond simple data augmentation using JPEG.
- There is no baseline for the differentiable quantizer. For example, comparisons could be made with methods such as the straight-through estimator (i.e., using the identity function as a gradient function) or additive uniform noise [1].
- The baseline for the image preprocessing method for training is insuffi

Code & Models

Repositories

jpeginspireddl/jpeg-inspired-dl
PyTorch · Official

Videos

JPEG Inspired Deep Learning · SlidesLive

Taxonomy

Topics: Computational Physics and Python Applications