PtychoDV: Vision Transformer-Based Deep Unrolling Network for Ptychographic Image Reconstruction

Weijie Gan; Qiuchen Zhai; Michael Thompson McCann; Cristina Garcia Cardona; Ulugbek S. Kamilov; Brendt Wohlberg

arXiv:2310.07504 · eess.IV · October 6, 2025 · 2 cites

Open Access · 1 Repo

TL;DR

PtychoDV is a vision transformer-based deep unrolling network for ptychographic image reconstruction. It outperforms existing deep learning methods in reconstruction quality and reduces computational cost compared to iterative algorithms.

Contribution

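The paper introduces PtychoDV, a deep model-based network that pairs a vision transformer, which generates an initial image from the raw measurements, with a deep unrolling network that refines that image using learnable convolutional priors and the ptychography measurement model.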

Findings

01. Outperforms existing deep learning methods in reconstruction quality on simulated data.

02. Reduces computational cost compared to iterative algorithms.

03. Maintains reconstruction performance competitive with iterative algorithms.

Abstract

Ptychography is an imaging technique that captures multiple overlapping snapshots of a sample, illuminated coherently by a moving localized probe. The image recovery from ptychographic data is generally achieved via an iterative algorithm that solves a nonlinear phase retrieval problem derived from measured diffraction patterns. However, these iterative approaches have high computational cost. In this paper, we introduce PtychoDV, a novel deep model-based network designed for efficient, high-quality ptychographic image reconstruction. PtychoDV comprises a vision transformer that generates an initial image from the set of raw measurements, taking into consideration their mutual correlations. This is followed by a deep unrolling network that refines the initial image using learnable convolutional priors and the ptychography measurement model. Experimental results on simulated data…
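
The measurement model and the refinement stage described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes the common far-field model in which each measurement is the Fourier amplitude of the probe multiplied by an image patch, an amplitude least-squares misfit, and plain gradient steps for data consistency. The names `forward`, `data_loss`, `data_gradient`, `unrolled_refine`, and `prior` are hypothetical. The learned convolutional prior is replaced by an identity placeholder, and a constant image stands in for the vision transformer's initial estimate.

```python
import numpy as np

def forward(x, probe, positions):
    """Simulate diffraction amplitudes |F(probe * patch_j)| for each scan position."""
    m = probe.shape[0]
    patches = np.stack([x[r:r + m, c:c + m] for r, c in positions])
    # fft2 acts on the last two axes, so all patches are transformed at once.
    return np.abs(np.fft.fft2(probe * patches, norm="ortho"))

def data_loss(x, probe, positions, y):
    """Amplitude least-squares misfit (an assumed loss, not taken from the paper)."""
    return 0.5 * np.sum((forward(x, probe, positions) - y) ** 2)

def data_gradient(x, probe, positions, y):
    """Gradient direction of data_loss with respect to the complex image x."""
    m = probe.shape[0]
    grad = np.zeros_like(x)
    for (r, c), y_j in zip(positions, y):
        z = np.fft.fft2(probe * x[r:r + m, c:c + m], norm="ortho")
        # Swap in the measured amplitude while keeping the current phase.
        z_proj = y_j * np.exp(1j * np.angle(z))
        residual = np.fft.ifft2(z - z_proj, norm="ortho")
        # Adjoint of "crop, then multiply by probe": conj(probe), then scatter back.
        grad[r:r + m, c:c + m] += np.conj(probe) * residual
    return grad

def unrolled_refine(x0, probe, positions, y, prior, n_iters=5, step=0.2):
    """Run a fixed number of (data-consistency step, prior step) pairs."""
    x = x0.copy()
    for _ in range(n_iters):
        x = x - step * data_gradient(x, probe, positions, y)
        x = prior(x)  # placeholder where a learned CNN block would go
    return x

# Toy setup: 16x16 complex object, 8x8 Gaussian probe, 3x3 overlapping scan grid.
rng = np.random.default_rng(0)
n, m = 16, 8
x_true = rng.uniform(0.5, 1.0, (n, n)) * np.exp(1j * rng.uniform(-1, 1, (n, n)))
g = np.exp(-0.5 * ((np.arange(m) - (m - 1) / 2) / 2.0) ** 2)
probe = np.outer(g, g).astype(complex)
positions = [(r, c) for r in (0, 4, 8) for c in (0, 4, 8)]
y = forward(x_true, probe, positions)

x0 = np.ones((n, n), dtype=complex)  # stand-in for the transformer's initial image
x_hat = unrolled_refine(x0, probe, positions, y, prior=lambda x: x)
loss_before = data_loss(x0, probe, positions, y)
loss_after = data_loss(x_hat, probe, positions, y)
print(loss_before, loss_after)
```

In this toy setup each pixel is covered by at most four probe positions and the probe magnitude stays below one, so a step of 0.2 is small enough that the data-consistency steps do not increase the misfit when the prior is the identity. In a trained unrolling network, the step size and the prior's weights would typically be learned end to end.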

Code & Models

Repositories

wjgancn/ptychodv (PyTorch, official)

Taxonomy

Topics: Advanced X-ray Imaging Techniques · Nuclear Physics and Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Layer Normalization · Dense Connections · Softmax · Vision Transformer