CATformer: Contrastive Adversarial Transformer for Image Super-Resolution

Qinyi Tian; Spence Cox; Laura E. Dalton

arXiv:2508.17708·cs.CV·August 26, 2025

TL;DR

CATformer is a neural network that combines a diffusion-inspired transformer with adversarial and contrastive learning to improve image super-resolution quality and noise robustness. The authors report that it outperforms recent transformer-based and diffusion-inspired methods in both efficiency and visual quality.

Contribution

Introduces CATformer, a dual-branch transformer architecture that integrates diffusion-inspired feature refinement with adversarial and contrastive learning for image super-resolution.

Findings

01 Outperforms recent transformer-based and diffusion-inspired methods on benchmark datasets in both efficiency and visual image quality.

02 Demonstrates robustness to noise through learned latent contrasts.

03 Bridges the gap among transformer, diffusion, and GAN-based super-resolution methods.
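
The findings above attribute noise robustness to learned latent contrasts, but this listing does not state which loss is used. As a point of reference only, here is a minimal sketch of an InfoNCE-style contrastive loss. That is one common choice and an assumption here, not the paper's confirmed objective. The names `cosine` and `info_nce`, the pairing scheme, and the temperature of 0.1 are all illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of latent vectors.

    Row i of `positives` is treated as the match for row i of `anchors`;
    every other row in `positives` acts as a negative for that anchor.
    (Assumed pairing scheme, not taken from the paper.)
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Toy latents: correctly paired rows should score a lower loss than shuffled rows.
clean = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(clean, clean)
shuffled_loss = info_nce(clean, clean[::-1])
```

In a super-resolution setting the anchors and positives might be latents of the same image under different noise levels, which would push the encoder toward noise-invariant features. That pairing is a guess at the intent, not something this listing confirms.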

Abstract

Super-resolution remains a promising technique to enhance the quality of low-resolution images. This study introduces CATformer (Contrastive Adversarial Transformer), a novel neural network integrating diffusion-inspired feature refinement with adversarial and contrastive learning. CATformer employs a dual-branch architecture combining a primary diffusion-inspired transformer, which progressively refines latent representations, with an auxiliary transformer branch designed to enhance robustness to noise through learned latent contrasts. These complementary representations are fused and decoded using deep Residual-in-Residual Dense Blocks for enhanced reconstruction quality. Extensive experiments on benchmark datasets demonstrate that CATformer outperforms recent transformer-based and diffusion-inspired methods both in efficiency and visual image quality. This work bridges the…
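
The data flow described in the abstract (a primary branch that refines latents progressively, an auxiliary branch, fusion, then a residual decoder) can be sketched as a toy example. This is a rough illustration under stated assumptions, not the authors' implementation: plain matrix products stand in for the transformer branches, fusion is assumed to be concatenation, the decoder keeps only the residual-in-residual skip structure without dense connections, and every name, size, step count, and the scaling factor `beta` is hypothetical.

```python
import random


def linear(x, w):
    """Multiply vector x by matrix w, given as a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def primary_branch(z, w, steps=4):
    """Stand-in for the diffusion-inspired branch: repeated small residual updates."""
    for _ in range(steps):
        z = [zi + ui for zi, ui in zip(z, linear(z, w))]
    return z


def auxiliary_branch(z, w):
    """Stand-in for the auxiliary branch: a single projection of the latent."""
    return linear(z, w)


def residual_decoder(fused, w_blocks, beta=0.2):
    """Inner residual blocks wrapped in one outer skip connection."""
    out = fused
    for w in w_blocks:
        out = [o + beta * u for o, u in zip(out, linear(out, w))]
    return [f + beta * o for f, o in zip(fused, out)]


def make_params(dim, rng, num_blocks=3):
    """Build hypothetical weights for both branches and the decoder."""
    return {
        "primary": rand_matrix(dim, dim, rng),
        "auxiliary": rand_matrix(dim, dim, rng),
        "decoder": [rand_matrix(2 * dim, 2 * dim, rng) for _ in range(num_blocks)],
    }


def forward(latent, params):
    """Run both branches, fuse by concatenation (assumed), then decode."""
    refined = primary_branch(latent, params["primary"])
    contrast = auxiliary_branch(latent, params["auxiliary"])
    fused = refined + contrast  # list concatenation, length 2 * dim
    return residual_decoder(fused, params["decoder"])


rng = random.Random(0)
dim = 4
params = make_params(dim, rng)
output = forward([0.5] * dim, params)
```

A real model would operate on image feature maps with attention layers and convolutional decoders. The sketch only shows how the two branches might be combined before decoding.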
