CNNs for JPEGs: A Study in Computational Cost
Samuel Felipe dos Santos, Nicu Sebe, and Jurandy Almeida

TL;DR
This paper investigates the computational costs of CNNs operating directly on JPEG compressed images and proposes techniques to reduce complexity while maintaining accuracy.
Contribution
It provides a detailed analysis of the cost of decoding images and of passing them through the network, and introduces methods that make frequency-domain CNNs more efficient.
Findings
Decoding JPEG images adds significant computational load.
Proposed techniques reduce model complexity without sacrificing accuracy.
Efficient models achieve better cost-accuracy trade-offs.
Abstract
Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, defining the state of the art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from the RGB pixels. However, most image data are usually available in compressed format, of which JPEG is the most widely used for transmission and storage purposes, demanding a preliminary decoding process that has a high computational load and memory usage. For this reason, deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years. Those methods usually extract a frequency domain representation of the image, like DCT, by a partial decoding, and then make adaptations to typical CNN architectures to work with them. One limitation of these current works is that, in order to accommodate the…
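To make the "frequency domain representation" mentioned in the abstract concrete, here is a minimal sketch of a block-wise 8×8 DCT, the transform JPEG is built on. It is an illustration under stated assumptions, not the authors' pipeline: a real partial decoder would read the quantized coefficients from the JPEG bitstream instead of recomputing them from pixels, and the function names `dct_matrix` and `blockwise_dct` are hypothetical. The JPEG level shift, quantization, and zigzag ordering are left out.

```python
import numpy as np

def dct_matrix(n: int = 8) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    x = np.arange(n).reshape(1, -1)  # sample index (columns)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return d

def blockwise_dct(channel: np.ndarray, block: int = 8) -> np.ndarray:
    """Apply a 2-D DCT to each block x block tile of one image channel.

    Returns an array of shape (H/block, W/block, block*block), i.e. one
    vector of coefficients per tile, flattened in row-major order
    (an assumption made here for simplicity; JPEG itself uses zigzag order).
    """
    h, w = channel.shape
    if h % block or w % block:
        raise ValueError("height and width must be multiples of the block size")
    d = dct_matrix(block)
    # (H, W) -> (H/b, b, W/b, b) -> (H/b, W/b, b, b): one tile per grid cell
    tiles = channel.reshape(h // block, block, w // block, block)
    tiles = tiles.transpose(0, 2, 1, 3)
    coeffs = d @ tiles @ d.T         # separable 2-D DCT, broadcast over tiles
    return coeffs.reshape(h // block, w // block, block * block)

if __name__ == "__main__":
    image = np.random.default_rng(0).random((32, 48))
    features = blockwise_dct(image)
    print(features.shape)
```

Under these assumptions, an H×W channel becomes an (H/8)×(W/8) grid with 64 coefficients per position, which is the kind of smaller-resolution, many-channel input that a CNN has to be adapted to accept.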
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
