Disparity-based Stereo Image Compression with Aligned Cross-View Priors
Yongqi Zhai, Luyang Tang, Yi Ma, Rui Peng, Ronggang Wang

TL;DR
This paper introduces DispSIC, a deep neural network for stereo image compression that leverages disparity-guided residual encoding and aligned cross-view priors to improve rate-distortion efficiency.
Contribution
The paper proposes an end-to-end trainable framework that combines stereo matching and residual encoding with a cross-view prior-based entropy model for stereo image compression.
Findings
DispSIC outperforms existing SIC methods on the KITTI and InStereo2K datasets.
It achieves better rate-distortion trade-offs with adaptive bitrate allocation.
It uses disparity information and aligned cross-view priors to improve probability estimation.
Abstract
With the wide application of stereo images in various fields, research on stereo image compression (SIC) has attracted extensive attention from academia and industry. The core of SIC is to fully exploit the mutual information between the left and right images and to reduce the redundancy between views as much as possible. In this paper, we propose DispSIC, an end-to-end trainable deep neural network in which we jointly train a stereo matching model to assist the image compression task. Based on the stereo matching results (i.e., the disparity), the right image can be easily warped to the left view, and only the residuals between the left and right views are encoded for the left image. DispSIC adopts a three-branch auto-encoder architecture that encodes the right image, the disparity map, and the residuals, respectively. During training, the whole network can learn how to adaptively…
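The warp-and-residual step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`warp_right_to_left`, `left_residual`, `reconstruct_left`) are hypothetical, the disparity is assumed to be a left-view map on rectified images (so left pixel (y, x) matches right pixel (y, x - d)), and plain linear interpolation stands in for whatever learned warping the network uses. A real codec would presumably warp the decoded right image with the decoded disparity so that the decoder can reproduce the same prediction.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Warp the right image into the left view (illustrative sketch).

    Assumes rectified stereo: left pixel (y, x) matches right pixel (y, x - d).
    right: (H, W) or (H, W, C) float array; disparity: (H, W) array in pixels.
    """
    h, w = disparity.shape
    # Horizontal source coordinate in the right image for every left pixel.
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity
    xs = np.clip(xs, 0.0, w - 1.0)  # clamp samples that fall outside the frame
    x0 = np.floor(xs).astype(np.int64)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    if right.ndim == 3:
        frac = frac[..., None]  # broadcast the weights over colour channels
    # Linear interpolation between the two nearest columns.
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def left_residual(left, right, disparity):
    """Residual that would be encoded for the left image."""
    return left - warp_right_to_left(right, disparity)


def reconstruct_left(residual, right, disparity):
    """Decoder side: add the residual back onto the warped right image."""
    return residual + warp_right_to_left(right, disparity)


# Toy usage: a left view that is the right view shifted by 3 pixels.
rng = np.random.default_rng(0)
right = rng.random((4, 8, 3))
left = np.roll(right, 3, axis=1)
disparity = np.full((4, 8), 3.0)
residual = left_residual(left, right, disparity)
restored = reconstruct_left(residual, right, disparity)
```

In this toy case the residual vanishes wherever the shifted pixel exists in the right image, which is why encoding the residual can cost far fewer bits than encoding the left image directly. The leftmost columns have no counterpart in the right view, so their residual stays non-zero.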
