Convolutional Neural Network (CNN) vs Vision Transformer (ViT) for Digital Holography

Stéphane Cuenat; Raphaël Couturier

arXiv:2108.09147 · cs.CV · January 28, 2022

TL;DR

This paper compares CNN and ViT deep learning architectures for auto-focusing in digital holography, demonstrating that ViT achieves high accuracy and robustness with finer distance classification than previous methods.

Contribution

The paper applies ViT to auto-focusing in digital holography for the first time, achieving a classification granularity of 1 μm and better robustness than CNN.

Findings

1. ViT achieves accuracy similar to CNN in auto-focusing.
2. ViT is more robust than CNN.
3. The classification granularity is improved to 1 μm.
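
To make the notion of classification granularity concrete, here is a minimal sketch of how a continuous object distance could be binned into class labels with a 1 μm step. The function names and the lower bound `z_min_um` are hypothetical and chosen only for illustration; the source does not state the distance range or the labelling code used by the authors.

```python
def distance_to_class(distance_um, z_min_um, step_um=1.0):
    """Map a distance (in micrometres) to the nearest class index.

    z_min_um is an assumed lower bound of the distance range;
    step_um is the spacing between consecutive classes (1 um here).
    """
    return int(round((distance_um - z_min_um) / step_um))


def class_to_distance(class_index, z_min_um, step_um=1.0):
    """Map a predicted class index back to a distance in micrometres."""
    return z_min_um + class_index * step_um
```

With this framing, a classifier's predicted index converts directly to a focal distance, and the step size bounds the resolution of the estimate.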

Abstract

In Digital Holography (DH), it is crucial to extract the object distance from a hologram in order to reconstruct its amplitude and phase. This step is called auto-focusing. It is conventionally solved by first reconstructing a stack of images at candidate distances and then scoring the sharpness of each reconstructed image with a focus metric such as entropy or variance. The distance corresponding to the sharpest image is considered the focal position. This approach, while effective, is computationally demanding and time-consuming. In this paper, the distance is determined by Deep Learning (DL). Two DL architectures are compared: Convolutional Neural Network (CNN) and Vision Transformer (ViT). Both treat auto-focusing as a classification problem. Compared to a first attempt [11] in which the distance between two consecutive classes was…
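
The conventional auto-focusing procedure described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the angular spectrum method is assumed for numerical reconstruction (the source does not name one), the input is assumed to be a complex field sampled on a square pixel grid, and all function names are hypothetical. Variance of the amplitude is used as the focus metric, and all lengths share one unit (for example micrometres).

```python
import numpy as np


def angular_spectrum_propagate(field, distance, wavelength, pixel_size):
    """Propagate a 2D complex field by `distance` (angular spectrum method)."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pixel_size)
    fy = np.fft.fftfreq(ny, d=pixel_size)
    FX, FY = np.meshgrid(fx, fy)
    # Squared longitudinal frequency; non-positive values are evanescent.
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    transfer = np.where(arg > 0, np.exp(1j * kz * distance), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)


def variance_focus_metric(image):
    """Sharpness score: variance of the reconstructed amplitude."""
    return float(np.var(np.abs(image)))


def autofocus(hologram, distances, wavelength, pixel_size,
              metric=variance_focus_metric):
    """Reconstruct at each candidate distance and keep the sharpest one."""
    scores = []
    for d in distances:
        # Back-propagate from the sensor plane toward the object plane.
        reconstruction = angular_spectrum_propagate(
            hologram, -d, wavelength, pixel_size)
        scores.append(metric(reconstruction))
    best = distances[int(np.argmax(scores))]
    return best, scores
```

Each candidate distance costs two FFTs plus a metric evaluation, so a fine search over many distances becomes expensive. That cost is what the abstract calls computationally demanding. Note that maximizing variance suits amplitude objects; other object types may need a different metric or criterion.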

Taxonomy

Topics: Image Processing Techniques and Applications · Digital Holography and Microscopy · Cell Image Analysis Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Label Smoothing · Residual Connection · Layer Normalization · Dense Connections · Adam · Absolute Position Encodings