TL;DR
This paper introduces a dual-stream architecture for learned data compression that improves efficiency, reduces latency, and enhances compression performance by disentangling local and global features.
Contribution
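The paper proposes a Dual-Stream Multi-Scale Decoupler and a Hierarchical Gated Refiner. The decoupler separates local and global contexts so that shallow parallel streams can replace deep serial stacking, and the refiner adaptively refines the resulting features for more precise probability modeling.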
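To make the dual-stream idea concrete, here is a toy Python sketch of two independent context streams whose outputs are blended by a gate. It is not the paper's architecture: the function names (`local_stream`, `global_stream`, `gated_fuse`), the moving-average features, and the fixed scalar gate are all assumptions chosen for illustration. In a learned compressor the streams and the gate would be trained networks.

```python
import math

def local_stream(x, window=3):
    """Hypothetical local stream: causal moving average over the last `window` values."""
    out = []
    for i in range(len(x)):
        ctx = x[max(0, i - window + 1): i + 1]
        out.append(sum(ctx) / len(ctx))
    return out

def global_stream(x):
    """Hypothetical global stream: causal running mean over everything seen so far."""
    out, total = [], 0.0
    for count, value in enumerate(x, start=1):
        total += value
        out.append(total / count)
    return out

def gated_fuse(local, global_, gate_logit=0.0):
    """Blend the two streams with a sigmoid gate (a fixed scalar here, learned in practice)."""
    g = 1.0 / (1.0 + math.exp(-gate_logit))
    return [g * l + (1.0 - g) * gl for l, gl in zip(local, global_)]

# The two streams do not depend on each other, so they could run in parallel
# instead of being stacked one after the other.
x = [1.0, 2.0, 3.0, 4.0]
fused = gated_fuse(local_stream(x, window=2), global_stream(x))
```

Because neither stream consumes the other's output, the depth of the computation is that of the slower stream plus the fusion step, rather than the sum of both.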
Findings
Achieves state-of-the-art compression ratio and throughput.
Maintains the lowest latency and memory usage among the compared methods.
Demonstrates effectiveness through extensive experiments.
Abstract
While Learned Data Compression (LDC) has achieved superior compression ratios, balancing precise probability modeling with system efficiency remains challenging. Crucially, uniform single-stream architectures struggle to simultaneously capture micro-syntactic and macro-semantic features, necessitating deep serial stacking that exacerbates latency. Compounding this, heterogeneous systems are constrained by device speed mismatches, where throughput is capped by Amdahl's Law due to serial processing. To this end, we propose a Dual-Stream Multi-Scale Decoupler that disentangles local and global contexts to replace deep serial processing with shallow parallel streams, and incorporate a Hierarchical Gated Refiner for adaptive feature refinement and precise probability modeling. Furthermore, we design a Concurrent Stream-Parallel Pipeline, which overcomes systemic bottlenecks to achieve…
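The abstract's appeal to Amdahl's Law can be checked with a short calculation. The sketch below uses the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of work that can run in parallel and n is the number of parallel workers. The function names and the example values are illustrative assumptions, not figures from the paper.

```python
import math

def amdahl_speedup(parallel_fraction, workers):
    """Speedup predicted by Amdahl's Law for a given parallel fraction and worker count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

def speedup_ceiling(parallel_fraction):
    """Upper bound on speedup as the worker count grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    return math.inf if serial_fraction == 0.0 else 1.0 / serial_fraction

# Illustrative values only: with half the work serial, extra workers help little.
for p in (0.5, 0.9, 0.99):
    print(p, amdahl_speedup(p, workers=8), speedup_ceiling(p))
```

The ceiling depends only on the serial fraction, which is why shrinking the serial portion of a pipeline raises attainable throughput more than adding faster devices does.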
