DMTNet: Dynamic Multi-scale Network for Dual-pixel Images Defocus Deblurring with Transformer

Dafeng Zhang; Xiaobing Wang

arXiv:2209.06040·cs.CV·September 19, 2022

TL;DR

DMTNet is a novel dynamic multi-scale network combining vision transformers and CNNs for dual-pixel image defocus deblurring, improving robustness and generalization, especially on small datasets.

Contribution

This paper introduces DMTNet, described as the first network to integrate vision transformers with dynamic multi-scale modules for defocus deblurring, with the aim of improving performance and adaptability.

Findings

01. DMTNet outperforms state-of-the-art methods on benchmark datasets.

02. The combination of transformer and CNN improves robustness with limited data.

03. Dynamic multi-scale modules adapt to different blur distributions effectively.

Abstract

Recent works achieve excellent results on the dual-pixel defocus deblurring task using convolutional neural networks (CNNs), but the scarcity of data has limited the exploration of vision transformers for this task. In addition, existing works use fixed parameters and a fixed network architecture to deblur images with different distributions and content, which limits the generalization ability of the model. In this paper, we propose a dynamic multi-scale network, named DMTNet, for dual-pixel image defocus deblurring. DMTNet mainly contains two modules: a feature extraction module and a reconstruction module. The feature extraction module is composed of several vision transformer blocks, whose strong feature extraction capability yields richer features and improves the robustness of the model. The reconstruction module is composed of several Dynamic…
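
The abstract says the feature extraction module is built from vision transformer blocks. As a rough illustration of the attention step such a block relies on, the following is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a single attention head, identity query/key/value projections, and no layer normalization or feed-forward layer, and the function names (`softmax`, `self_attention`, `residual_block`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), which a real transformer block would
    replace with learned linear layers.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        # Each output token is a weighted average of all value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens))
                    for j in range(d)])
    return out


def residual_block(tokens):
    """Attention output added back onto its input (residual connection)."""
    attended = self_attention(tokens)
    return [[t + a for t, a in zip(tok, att)]
            for tok, att in zip(tokens, attended)]


# Hypothetical usage: two 2-dimensional patch tokens.
features = residual_block([[1.0, 0.0], [0.0, 1.0]])
```

Because every output token mixes information from all input tokens, a block like this can relate distant image regions in one step, which is the usual motivation for pairing transformer features with a convolutional reconstruction stage.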

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Layer Normalization · Dense Connections · Residual Connection · Vision Transformer