PtychoDV: Vision Transformer-Based Deep Unrolling Network for Ptychographic Image Reconstruction
Weijie Gan, Qiuchen Zhai, Michael Thompson McCann, Cristina Garcia Cardona, Ulugbek S. Kamilov, Brendt Wohlberg

TL;DR
PtychoDV is a vision transformer-based deep unrolling network that reconstructs high-quality ptychographic images efficiently, outperforming existing deep learning methods while reducing computational cost relative to iterative algorithms.
Contribution
The paper introduces PtychoDV, a deep model-based network that combines a vision transformer, which produces an initial image from the raw measurements, with a deep unrolling network that refines it for ptychographic image reconstruction.
Findings
Outperforms existing deep learning methods in reconstruction quality on simulated data.
Reduces computational cost compared to iterative algorithms.
Maintains reconstruction performance competitive with those iterative algorithms.
Abstract
Ptychography is an imaging technique that captures multiple overlapping snapshots of a sample, illuminated coherently by a moving localized probe. The image recovery from ptychographic data is generally achieved via an iterative algorithm that solves a nonlinear phase retrieval problem derived from measured diffraction patterns. However, these iterative approaches have high computational cost. In this paper, we introduce PtychoDV, a novel deep model-based network designed for efficient, high-quality ptychographic image reconstruction. PtychoDV comprises a vision transformer that generates an initial image from the set of raw measurements, taking into consideration their mutual correlations. This is followed by a deep unrolling network that refines the initial image using learnable convolutional priors and the ptychography measurement model. Experimental results on simulated data…
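The measurement model and the refinement idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`forward`, `data_gradient`, `unrolled_refine`), the amplitude-based data-fit loss, the step size, and the identity `prior` are all assumptions made for illustration. In PtychoDV the prior is a learned convolutional network and the initial image comes from a vision transformer; here both are stand-ins.

```python
import numpy as np

def forward(obj, probe, positions):
    """Diffraction magnitudes |FFT(probe * object patch)| for each scan position."""
    h, w = probe.shape
    return np.stack([
        np.abs(np.fft.fft2(probe * obj[r:r + h, c:c + w], norm="ortho"))
        for r, c in positions
    ])

def data_gradient(obj, probe, positions, meas, eps=1e-8):
    """Gradient of 0.5 * sum_j || |F(P * x_j)| - y_j ||^2 with respect to the object."""
    h, w = probe.shape
    grad = np.zeros_like(obj, dtype=complex)
    for (r, c), y in zip(positions, meas):
        f = np.fft.fft2(probe * obj[r:r + h, c:c + w], norm="ortho")
        # Residual after replacing the predicted magnitude with the measured one.
        resid = f - y * f / (np.abs(f) + eps)
        grad[r:r + h, c:c + w] += np.conj(probe) * np.fft.ifft2(resid, norm="ortho")
    return grad

def unrolled_refine(obj0, probe, positions, meas, n_iters=5, step=0.5,
                    prior=lambda x: x):
    """Fixed number of iterations: a data-consistency step, then a prior step.

    `prior` is a hypothetical placeholder (identity by default); a deep
    unrolling network would use a trained module at this point.
    """
    obj = np.asarray(obj0, dtype=complex).copy()
    for _ in range(n_iters):
        obj = obj - step * data_gradient(obj, probe, positions, meas)
        obj = prior(obj)
    return obj
```

A typical use would simulate measurements with `forward` from a known object, then call `unrolled_refine` on a rough initial image and check that the data-fit error drops. The step size of 0.5 is arbitrary and would need tuning for probes with larger amplitude or heavier overlap.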
Taxonomy
Topics: Advanced X-ray Imaging Techniques · Nuclear Physics and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Layer Normalization · Dense Connections · Softmax · Vision Transformer
