Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging
Siming Zheng, Xin Yuan

TL;DR
This paper introduces a novel deep unfolding network with a 3D Convolution-Transformer module and uncertainty estimation for video snapshot compressive imaging, achieving state-of-the-art reconstruction quality.
Contribution
To the authors' knowledge, it is the first work to apply a Transformer to video SCI reconstruction, embedding it in a deep unfolding framework to strengthen spatiotemporal correlation learning and add uncertainty modeling.
Findings
Achieves 1.2 dB higher PSNR than previous SOTA methods
Introduces a 3D Convolution-Transformer module for better spatiotemporal learning
Incorporates pixel-wise uncertainty estimation to improve reconstruction quality
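As a rough illustration of what pixel-wise uncertainty estimation can look like, the sketch below uses a generic Laplace negative log-likelihood in which the network predicts a log-scale value for every pixel alongside the reconstruction. This is an assumption for illustration only: the summary above does not state the paper's actual loss, and the function name `uncertainty_l1_loss` and its arguments are hypothetical.

```python
import numpy as np

def uncertainty_l1_loss(pred, target, log_scale):
    """Generic per-pixel Laplace negative log-likelihood (constant term dropped).

    ASSUMPTION: a common uncertainty-weighted L1 form, not necessarily the
    loss used in the paper. `log_scale` has the same shape as `pred` and
    holds the predicted log of the Laplace scale for each pixel.
    """
    abs_err = np.abs(pred - target)
    # Pixels with a large predicted scale are down-weighted, while the
    # additive log_scale term penalizes claiming high uncertainty everywhere.
    return np.mean(abs_err * np.exp(-log_scale) + log_scale)

# Toy data: a constant error of 0.5 at every pixel.
target = np.zeros((4, 4))
pred = np.full((4, 4), 0.5)

loss_plain = uncertainty_l1_loss(pred, target, np.zeros((4, 4)))
loss_tuned = uncertainty_l1_loss(pred, target, np.full((4, 4), np.log(0.5)))
```

With a zero log-scale the loss reduces to the ordinary mean absolute error; setting the log-scale to the log of the actual error magnitude gives a lower loss, which is what lets such a term act as a learned per-pixel confidence.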
Abstract
We consider the problem of video snapshot compressive imaging (SCI), where sequential high-speed frames are modulated by different masks and captured in a single measurement. Reconstructing multiple frames from a single measurement requires solving an ill-posed inverse problem. By combining optimization algorithms with neural networks, deep unfolding networks (DUNs) have achieved tremendous success in solving inverse problems. In this paper, we propose a model under the DUN framework built around a 3D Convolution-Transformer Mixture (CTM) module with an efficient and scalable 3D attention model plugged in, which uses the Transformer to fully learn the correlations across the temporal and spatial dimensions. To the best of our knowledge, this is the first time a Transformer has been applied to video SCI reconstruction. Besides, to further investigate the high-frequency…
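The measurement process described in the abstract (each frame modulated by its own mask, then all frames summed into one snapshot) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' code: the function names `sci_forward` and `sci_adjoint_init`, the binary random masks, and the toy sizes are all hypothetical choices made for the example.

```python
import numpy as np

def sci_forward(frames, masks):
    """Compress T frames of shape (T, H, W) into one (H, W) measurement.

    Each frame is multiplied element-wise by its mask and the results
    are summed along the time axis.
    """
    return np.sum(masks * frames, axis=0)

def sci_adjoint_init(measurement, masks, eps=1e-8):
    """Coarse starting estimate of the frames from the measurement.

    Spreads the measurement back over the frames through the masks,
    normalized by the per-pixel mask energy. ASSUMPTION: a simple,
    commonly used initialization, not the paper's reconstruction.
    """
    mask_energy = np.sum(masks ** 2, axis=0)
    return masks * (measurement / (mask_energy + eps))[None, :, :]

# Toy sizes and random binary masks (illustrative assumptions).
rng = np.random.default_rng(0)
T, H, W = 8, 16, 16
frames = rng.random((T, H, W))
masks = rng.integers(0, 2, size=(T, H, W)).astype(np.float64)

measurement = sci_forward(frames, masks)
estimate = sci_adjoint_init(measurement, masks)
```

The coarse estimate reproduces the measurement when passed back through `sci_forward`, yet it generally differs from the true frames. That gap is the ill-posedness the abstract refers to: many frame stacks explain the same snapshot, so a learned prior is needed to pick a plausible one.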
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Advanced MRI Techniques and Applications · Photoacoustic and Ultrasonic Imaging
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Layer Normalization · Adam · Absolute Position Encodings · Softmax · Residual Connection
