Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging

Siming Zheng; Xin Yuan

arXiv:2306.11316 · cs.CV · June 21, 2023 · 1 citation


TL;DR

This paper introduces a novel deep unfolding network with a 3D Convolution-Transformer module and uncertainty estimation for video snapshot compressive imaging, achieving state-of-the-art reconstruction quality.

Contribution

To the authors' knowledge, this is the first work to apply a Transformer to video SCI reconstruction. It does so within a deep unfolding framework, aiming to strengthen spatiotemporal correlation learning and to model reconstruction uncertainty.

Findings

01 Achieves 1.2 dB higher PSNR than previous SOTA methods

02 Introduces a 3D Convolution-Transformer module for better spatiotemporal learning

03 Incorporates pixel-wise uncertainty estimation to improve reconstruction quality
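To put the reported PSNR gain in perspective, the short sketch below uses the standard definition PSNR = 10·log10(peak² / MSE) to convert a dB gain into a change in mean squared error. The baseline MSE value and the function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def psnr(mse, peak=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)

# A 1.2 dB gain corresponds to a fixed multiplicative drop in MSE,
# independent of the starting point.
ratio = mse_ratio_for_gain(1.2)

baseline_mse = 1e-3                  # hypothetical baseline, not from the paper
improved_mse = baseline_mse * ratio  # MSE after a 1.2 dB improvement
gain = psnr(improved_mse) - psnr(baseline_mse)
```

Under this definition, a 1.2 dB gain means the mean squared error drops to about 76% of its previous value, that is, roughly 24% lower.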

Abstract

We consider the problem of video snapshot compressive imaging (SCI), where sequential high-speed frames are modulated by different masks and captured in a single measurement. Reconstructing multiple frames from only one measurement requires solving an ill-posed inverse problem. By combining optimization algorithms with neural networks, deep unfolding networks (DUNs) have achieved considerable success in solving inverse problems. In this paper, we build our model within the DUN framework and propose a 3D Convolution-Transformer Mixture (CTM) module with a 3D efficient and scalable attention model plugged in, which uses the Transformer to learn the correlation between the temporal and spatial dimensions. To the best of our knowledge, this is the first time that a Transformer has been applied to video SCI reconstruction. Besides, to further investigate the high-frequency…
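The measurement process described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration of the commonly used video SCI forward model (each frame multiplied element-wise by its mask, then summed over time) together with one generic data-fidelity projection of the kind that unfolding schemes alternate with a learned prior. It is an assumption-laden sketch, not the authors' implementation: the function names, array sizes, binary random masks, and the `eps` stabiliser are all hypothetical, and the learned CTM prior is omitted entirely.

```python
import numpy as np

def sci_forward(frames, masks):
    """Collapse T modulated frames of shape (T, H, W) into one (H, W) measurement."""
    return np.sum(masks * frames, axis=0)

def sci_transpose(measurement, masks):
    """Adjoint of the forward operator: spread an (H, W) image back to (T, H, W)."""
    return masks * measurement[None, :, :]

def data_fidelity_step(x, y, masks, eps=1e-8):
    """Project x toward the set of videos whose measurement equals y.

    For this forward model, Phi Phi^T is diagonal with entries sum_t masks_t^2,
    so its inverse reduces to an element-wise division.
    """
    phi_phi_t = np.sum(masks ** 2, axis=0)
    residual = (y - sci_forward(x, masks)) / (phi_phi_t + eps)
    return x + sci_transpose(residual, masks)

# Hypothetical toy problem: 8 frames of 16x16 pixels with random binary masks.
rng = np.random.default_rng(0)
T, H, W = 8, 16, 16
frames = rng.random((T, H, W))
masks = rng.integers(0, 2, size=(T, H, W)).astype(np.float64)

y = sci_forward(frames, masks)         # the single snapshot measurement
x0 = sci_transpose(y, masks)           # crude initial estimate
x1 = data_fidelity_step(x0, y, masks)  # estimate consistent with y

consistency_error = np.max(np.abs(sci_forward(x1, masks) - y))
```

After the projection, the estimate reproduces the measurement almost exactly, yet it is generally far from the true frames. Many videos share the same snapshot, which is why the problem is ill-posed and why a strong prior is needed between projection steps.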


Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Advanced MRI Techniques and Applications · Photoacoustic and Ultrasonic Imaging

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Layer Normalization · Adam · Absolute Position Encodings · Softmax · Residual Connection