DMTNet: Dynamic Multi-scale Network for Dual-pixel Images Defocus Deblurring with Transformer
Dafeng Zhang, Xiaobing Wang

TL;DR
DMTNet is a novel dynamic multi-scale network combining vision transformers and CNNs for dual-pixel image defocus deblurring, improving robustness and generalization, especially on small datasets.
Contribution
This paper introduces DMTNet, presented as the first network to integrate vision transformers with dynamic multi-scale modules for defocus deblurring, with the aim of improving performance and adaptability.
Findings
DMTNet outperforms state-of-the-art methods on benchmark datasets.
The combination of transformer and CNN improves robustness with limited data.
Dynamic multi-scale modules adapt to different blur distributions effectively.
Abstract
Recent works achieve excellent results in the dual-pixel defocus deblurring task using convolutional neural networks (CNNs), while the scarcity of data has limited the exploration of vision transformers for this task. In addition, existing works use fixed parameters and a fixed network architecture to deblur images with different distributions and content, which limits the generalization ability of the model. In this paper, we propose a dynamic multi-scale network, named DMTNet, for dual-pixel image defocus deblurring. DMTNet mainly contains two modules: a feature extraction module and a reconstruction module. The feature extraction module is composed of several vision transformer blocks, whose powerful feature extraction capability yields richer features and improves the robustness of the model. The reconstruction module is composed of several Dynamic…
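To make the feature extraction idea concrete, here is a minimal sketch of a generic pre-norm transformer block applied to tokens built from a dual-pixel pair. It is not DMTNet's actual implementation: the helper names (`dual_pixel_tokens`, `transformer_block`), the choice to concatenate left and right patches into one token, the single attention head, and the identity query/key/value projections are all assumptions made for illustration, and the feed-forward sublayer is omitted for brevity.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(token, eps=1e-5):
    # Normalize one token to zero mean and unit variance (no learned scale/shift).
    mean = sum(token) / len(token)
    var = sum((v - mean) ** 2 for v in token) / len(token)
    return [(v - mean) / math.sqrt(var + eps) for v in token]

def self_attention(tokens):
    # Single-head scaled dot-product attention.
    # Assumption: identity Q/K/V projections instead of learned linear layers.
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

def transformer_block(tokens):
    # Pre-norm block: layer norm, self-attention, then a residual connection.
    normed = [layer_norm(t) for t in tokens]
    attended = self_attention(normed)
    return [[x + a for x, a in zip(t, att)] for t, att in zip(tokens, attended)]

def dual_pixel_tokens(left, right, patch):
    # Hypothetical tokenizer: each token holds one patch from the left view
    # followed by the same patch from the right view, flattened row by row.
    tokens = []
    for r in range(0, len(left), patch):
        for c in range(0, len(left[0]), patch):
            tok = []
            for view in (left, right):
                for i in range(r, r + patch):
                    tok.extend(view[i][c:c + patch])
            tokens.append(tok)
    return tokens

# Toy 4x4 dual-pixel pair; the right view is a shifted copy of the left.
left = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
right = [[v + 0.5 for v in row] for row in left]

tokens = dual_pixel_tokens(left, right, patch=2)  # 4 tokens of dimension 8
features = transformer_block(tokens)              # same shape as tokens
```

Stacking several such blocks gives a feature extractor in the spirit of the one described above. A practical version would use learned projections, multiple heads, and a feed-forward sublayer.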
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Layer Normalization · Dense Connections · Residual Connection · Vision Transformer
