What Matters in Practical Learned Image Compression
Kedar Tatwawadi, Parisa Rahimzadeh, Zhanghao Sun, Zhiqi Chen, Ziyun Yang, Sanjay Nair, Divija Hasteer, Oren Rippel

TL;DR
This paper introduces a practical learned image codec optimized for perceptual quality and runtime, achieving significant bitrate savings and real-time performance on mobile devices.
Contribution
It presents a comprehensive study and neural architecture search to design a new codec that balances perceptual quality, speed, and compression efficiency.
Findings
Achieves 2.3-3x bitrate savings over AV1, AV2, VVC, ECM, and JPEG-AI.
Provides 20-40% bitrate savings over top learned codecs.
Encodes 12MP images in 230ms and decodes in 150ms on an iPhone 17 Pro Max.
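To put the figures above on a common scale, the short sketch below converts them into throughput and percent terms. It assumes that an "N x bitrate saving" means the baseline needs N times the bits at comparable quality, which is an interpretation and is not stated in this summary. The helper names are hypothetical and are not taken from the paper.

```python
# Back-of-the-envelope conversions for the reported figures.
# Assumption (not from the paper): "N x savings" = baseline_bits / codec_bits.

def megapixels_per_second(megapixels: float, latency_ms: float) -> float:
    """Throughput implied by processing `megapixels` in `latency_ms`."""
    return megapixels / (latency_ms / 1000.0)

def ratio_to_percent_reduction(ratio: float) -> float:
    """Percent fewer bits when the baseline needs `ratio` times as many."""
    return (1.0 - 1.0 / ratio) * 100.0

def percent_reduction_to_ratio(percent: float) -> float:
    """Inverse conversion: percent fewer bits -> baseline/codec bit ratio."""
    return 1.0 / (1.0 - percent / 100.0)

# 12MP in 230 ms (encode) and 150 ms (decode), as reported above.
encode_mps = megapixels_per_second(12, 230)   # roughly 52 MP/s
decode_mps = megapixels_per_second(12, 150)   # roughly 80 MP/s

# 2.3x-3x over traditional codecs, expressed as percent fewer bits.
low_pct = ratio_to_percent_reduction(2.3)     # roughly 57%
high_pct = ratio_to_percent_reduction(3.0)    # roughly 67%

# 20-40% over learned codecs, expressed as a ratio.
low_ratio = percent_reduction_to_ratio(20)    # 1.25x
high_ratio = percent_reduction_to_ratio(40)   # roughly 1.67x

print(f"encode {encode_mps:.1f} MP/s, decode {decode_mps:.1f} MP/s")
print(f"{low_pct:.1f}%-{high_pct:.1f}% fewer bits; {low_ratio:.2f}x-{high_ratio:.2f}x")
```

Under this reading, the two bitrate findings sit on the same scale: the gain over traditional codecs is roughly 57-67% fewer bits, against 20-40% over learned codecs.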
Abstract
One of the major differentiators unlocked by learned codecs relative to their hard-coded traditional counterparts is their ability to be optimized directly to appeal to the human visual system. Despite this potential, a perceptual yet practical image codec is yet to be proposed. In this work, we aim to close this gap. We conduct a comprehensive study of the key modeling choices that govern the design of a practical learned image codec, jointly optimized for perceptual quality and runtime -- including within the ablations several novel techniques. We then perform performance-aware neural architecture search over millions of backbone configurations to identify models that achieve the target on-device runtime while maximizing compression performance as captured by perceptual metrics. We combine the various optimizations to construct a new codec that achieves a significantly improved…
