Hybrid Transformer and CNN Attention Network for Stereo Image Super-resolution

Ming Cheng; Haoyu Ma; Qiufang Ma; Xiaopeng Sun; Weiqi Li; Zhenyu Zhang; Xuhan Sheng; Shijie Zhao; Junlin Li; Li Zhang

arXiv:2305.05177 · cs.CV · May 10, 2023 · 2 cites

Open Access

TL;DR

This paper introduces HTCAN, a hybrid network for stereo image super-resolution that uses a transformer for single-image enhancement and a CNN for stereo information fusion, trained with a multi-patch strategy and larger window sizes.

Contribution

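The paper proposes a hybrid network that combines transformers and CNNs for stereo super-resolution. It targets two limitations of existing transformer-based methods: they cannot exploit complementary stereo information, and they depend on more training data than stereo super-resolution typically provides.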

Findings

01. Achieved 23.90 dB PSNR in the NTIRE 2023 stereo image super-resolution challenge

02. Outperformed existing stereo super-resolution methods

03. Used multi-patch training and larger window sizes
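For readers unfamiliar with the metric behind the 23.90 dB figure, the following is a minimal sketch of how PSNR is commonly computed, assuming 8-bit images stored as NumPy arrays. The helper names `psnr` and `stereo_psnr` are hypothetical, and averaging the left and right views is an assumption about the evaluation protocol, not something this page states.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    if reference.shape != estimate.shape:
        raise ValueError("images must have the same shape")
    # Mean squared error over every pixel and channel.
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


def stereo_psnr(gt_left, gt_right, sr_left, sr_right, max_value=255.0):
    """Assumed protocol: average the PSNR of the left and right views."""
    left = psnr(gt_left, sr_left, max_value)
    right = psnr(gt_right, sr_right, max_value)
    return 0.5 * (left + right)
```

Because PSNR is logarithmic in the mean squared error, a gain of a fraction of a dB already corresponds to a measurable reduction in reconstruction error.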

Abstract

Multi-stage strategies are frequently employed in image restoration tasks. While transformer-based methods have exhibited high efficiency in single-image super-resolution tasks, they have not yet shown significant advantages over CNN-based methods in stereo super-resolution tasks. This can be attributed to two key factors: first, current single-image super-resolution transformers are unable to leverage the complementary stereo information during the process; second, the performance of transformers is typically reliant on sufficient data, which is absent in common stereo-image super-resolution algorithms. To address these issues, we propose a Hybrid Transformer and CNN Attention Network (HTCAN), which utilizes a transformer-based network for single-image enhancement and a CNN-based network for stereo information fusion. Furthermore, we employ a multi-patch training strategy and larger…
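To illustrate what "stereo information fusion" can look like in practice, here is a minimal NumPy sketch of row-wise cross-view attention, a technique widely used in stereo super-resolution for rectified image pairs. It is not taken from the paper: the function name `cross_view_attention`, the residual fusion, and the scaling are all assumptions for illustration, and HTCAN's actual fusion module may differ.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_view_attention(left, right):
    """Fuse right-view features into the left view (illustrative only).

    left, right: feature maps of shape (H, W, C) from a rectified stereo
    pair, so matching pixels lie on the same image row.
    """
    channels = left.shape[-1]
    # scores[h, i, j]: similarity of left pixel (h, i) to right pixel (h, j).
    scores = np.einsum("hic,hjc->hij", left, right) / np.sqrt(channels)
    attention = softmax(scores, axis=-1)
    # Attention-weighted average of right-view features along each row.
    warped = np.einsum("hij,hjc->hic", attention, right)
    # Assumed residual fusion: add the warped features to the left view.
    return left + warped
```

Restricting attention to a single row keeps the cost at O(H·W²) rather than O((H·W)²), which is one reason this form of fusion suits rectified stereo pairs.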

Taxonomy

Topics: Advanced Image Processing Techniques · Image Processing Techniques and Applications · Advanced Vision and Imaging

Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Linear Layer · Label Smoothing · Dropout · Byte Pair Encoding · Dense Connections · Residual Connection · Adam