Accelerating JPEG Decompression on GPUs
André Weißenberger, Bertil Schmidt

TL;DR
This paper introduces a GPU-accelerated JPEG decompression method that significantly outperforms existing CPU and GPU implementations, reducing bottlenecks in GPU-based image processing pipelines.
Contribution
The paper presents a novel parallel JPEG decoding algorithm for GPUs that evaluates codewords at arbitrary positions, enabling efficient independent chunk decompression.
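To illustrate why decoding from an arbitrary bit position can work at all, here is a minimal Python sketch. It relies on a general property of Huffman-style prefix codes: a decoder started at a wrong bit offset often falls back onto the true codeword boundaries after a few symbols. The four-symbol code table and the helper names (`encode`, `decode_from`, `sync_point`) are hypothetical and chosen for illustration only. This is not the paper's GPU algorithm and it is not a JPEG decoder.

```python
# Toy prefix code (hypothetical). Real JPEG Huffman tables are read from the file.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
DECODE = {bits: sym for sym, bits in CODE.items()}
MAX_LEN = max(len(bits) for bits in DECODE)


def encode(message):
    """Concatenate the codewords of all symbols into one bit string."""
    return "".join(CODE[sym] for sym in message)


def decode_from(bits, start):
    """Decode greedily from bit index `start`.

    Returns a list of (codeword_start_index, symbol) pairs. Decoding stops
    at the end of the stream or at an incomplete trailing codeword.
    """
    out = []
    pos = start
    while pos < len(bits):
        for length in range(1, MAX_LEN + 1):
            word = bits[pos:pos + length]
            if len(word) == length and word in DECODE:
                out.append((pos, DECODE[word]))
                pos += length
                break
        else:
            break  # no complete codeword fits in the remaining bits
    return out


def sync_point(bits, start):
    """First bit index where a decoder started at `start` hits a true
    codeword boundary, or None if it never does within this stream."""
    true_starts = {pos for pos, _ in decode_from(bits, 0)}
    for pos, _ in decode_from(bits, start):
        if pos in true_starts:
            return pos
    return None


if __name__ == "__main__":
    stream = encode("abcdabcadbcd")
    for offset in range(1, 6):
        print("start at bit", offset, "-> synchronized at bit", sync_point(stream, offset))
```

Once a decoder reaches a true boundary, everything it decodes afterwards matches the sequential decode. In this toy setting, that is what lets a chunk be decoded without first decoding all the data before it. Only the symbols decoded before the synchronization point are unreliable, and they would have to be recovered from the preceding chunk.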
Findings
Outperforms libjpeg-turbo by up to 51 times on an A100 GPU
Outperforms nvJPEG by up to 8 times on an A100 GPU
Achieves a 3.4 times speedup over nvJPEG running with the hardware JPEG decoder
Abstract
The JPEG compression format has been the standard for lossy image compression for several decades, offering high compression rates at minor perceptual loss in image quality. For GPU-accelerated computer vision and deep learning tasks, such as the training of image classification models, efficient JPEG decoding is essential due to limitations in memory bandwidth. As many decoder implementations are CPU-based, decoded image data has to be transferred to accelerators like GPUs via interconnects such as PCI-E, which reduces throughput. JPEG decoding therefore represents a considerable bottleneck in these pipelines. Efficiency could be vastly increased by using a GPU-accelerated decoder instead. In this case, only compressed data needs to be transferred, as decoding is handled by the accelerators. In order to design such a GPU-based decoder, the respective…
Taxonomy
Topics: Advanced Data Compression Techniques · Advanced Image and Video Retrieval Techniques · Algorithms and Data Compression
