TL;DR
This paper introduces S2-CoT, a framework with two adapters that co-tune structure and semantics in pre-trained image codecs, achieving state-of-the-art compression with minimal additional parameters.
Contribution
The paper proposes a novel co-tuning framework with synergistic adapters for structure and semantics, improving parameter-efficient image compression.
Findings
S2-CoT achieves state-of-the-art results across four base codecs.
It maintains high fidelity while reducing the number of trainable parameters.
It improves adaptation of the entropy model, which leads to better statistical coding.
Abstract
Parameter-efficient fine-tuning of pre-trained codecs is a promising direction in image compression for human and machine vision. While most existing works have primarily focused on tuning the feature structure within the encoder-decoder backbones, the adaptation of the statistical semantics within the entropy model has received limited attention despite its function of predicting the probability distribution of latent features. Our analysis reveals that naive adapter insertion into the entropy model can lead to suboptimal outcomes, underscoring that the effectiveness of adapter-based tuning depends critically on the coordination between adapter type and placement across the compression pipeline. Therefore, we introduce Structure-Semantics Co-Tuning (S2-CoT), a novel framework that achieves this coordination via two specialized, synergistic adapters: the Structural Fidelity Adapter…
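To make the entropy model's role concrete, here is a minimal sketch of how a predicted distribution turns into a bit cost. It is not the S2-CoT implementation. It assumes a Gaussian entropy model over integer-quantized latents, a common choice in learned codecs that the abstract does not confirm for this paper. The function names and the toy numbers are hypothetical.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a normal distribution N(mean, scale^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def bits_for_symbol(y_hat, mean, scale, eps=1e-9):
    """Estimated bits to code the integer y_hat.

    The probability is the Gaussian mass on the quantization bin
    [y_hat - 0.5, y_hat + 0.5]; the cost is its negative log2.
    """
    p = gaussian_cdf(y_hat + 0.5, mean, scale) - gaussian_cdf(y_hat - 0.5, mean, scale)
    return -math.log2(max(p, eps))  # eps guards against log2(0)


def estimate_bits(latents, means, scales):
    """Sum the per-symbol cost after rounding each latent to an integer."""
    return sum(
        bits_for_symbol(round(y), m, s)
        for y, m, s in zip(latents, means, scales)
    )


# Toy latents (hypothetical values, not taken from the paper).
latents = [0.2, -1.7, 3.1]

# Entropy model whose predicted means match the latents well.
well_adapted = estimate_bits(latents, means=[0.0, -2.0, 3.0], scales=[1.0, 1.0, 1.0])

# Entropy model whose predictions are stale, e.g. after only the backbone was tuned.
mismatched = estimate_bits(latents, means=[0.0, 0.0, 0.0], scales=[1.0, 1.0, 1.0])

print(f"well adapted: {well_adapted:.2f} bits, mismatched: {mismatched:.2f} bits")
```

Under these assumptions, the mismatched predictions cost more bits for the same latents. This illustrates the abstract's point that the entropy model needs to be adapted along with the encoder-decoder backbone.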
