A Low-Resolution Image is Worth 1x1 Words: Enabling Fine Image Super-Resolution with Transformers and TaylorShift
Sanath Budakegowdanadoddi Nagaraju, Brian Bernhard Moser, Tobias Christian Nauen, Stanislav Frolov, Federico Raue, and Andreas Dengel

TL;DR
TaylorIR introduces a novel transformer framework for image super-resolution that uses 1x1 patch embeddings and a Taylor-series-based attention mechanism, achieving state-of-the-art results with reduced memory usage.
Contribution
The paper presents TaylorIR, a new SR model that enforces pixel-wise reasoning and employs TaylorShift attention for efficient, high-quality image reconstruction.
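To see why pixel-wise tokens put pressure on attention cost, it can help to count tokens. The following is a rough sketch, not taken from the paper: the 64x64 input size and the helper names `num_tokens` and `attention_matrix_entries` are illustrative assumptions.

```python
def num_tokens(height, width, patch_size):
    """Number of tokens when an image is split into non-overlapping square patches."""
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    return (height // patch_size) * (width // patch_size)


def attention_matrix_entries(n_tokens):
    """Entries in the full N x N score matrix used by standard self-attention."""
    return n_tokens * n_tokens


# Hypothetical 64x64 low-resolution input; patch sizes chosen for illustration.
for patch_size in (8, 4, 1):
    n = num_tokens(64, 64, patch_size)
    print(f"patch {patch_size}x{patch_size}: {n} tokens, "
          f"{attention_matrix_entries(n)} attention scores")
```

Under these assumptions, moving from 8x8 patches to 1x1 patches raises the token count from 64 to 4096 and the score matrix from 4096 to 16,777,216 entries, which is the cost that a near-linear attention mechanism is meant to avoid.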
Findings
Achieves state-of-the-art super-resolution performance.
Reduces memory consumption by up to 60%.
Effectively balances detail restoration and model efficiency.
Abstract
Transformer-based architectures have recently advanced the image reconstruction quality of super-resolution (SR) models. Yet, their scalability remains limited by quadratic attention costs and coarse patch embeddings that weaken pixel-level fidelity. We propose TaylorIR, a plug-and-play framework that enforces 1x1 patch embeddings for true pixel-wise reasoning and replaces conventional self-attention with TaylorShift, a Taylor-series-based attention mechanism enabling full token interactions with near-linear complexity. Across multiple SR benchmarks, TaylorIR delivers state-of-the-art performance while reducing memory consumption by up to 60%, effectively bridging the gap between fine-grained detail restoration and efficient transformer scaling.
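The idea of a Taylor-series-based attention with full token interactions at near-linear cost can be illustrated with a minimal NumPy sketch. It assumes a second-order expansion, exp(s) ≈ 1 + s + s²/2, in place of the exponential in softmax; the function names are hypothetical, and any scaling or normalization details of the actual TaylorShift mechanism are not covered in this summary and are omitted here. The sketch is not the authors' implementation.

```python
import numpy as np


def taylor_attention_direct(q, k, v):
    """Quadratic reference: builds the full N x N score matrix."""
    s = q @ k.T                          # (N, N) pairwise scores
    w = 1.0 + s + 0.5 * s**2             # 2nd-order Taylor expansion of exp(s); always > 0
    return (w @ v) / w.sum(axis=1, keepdims=True)


def taylor_attention_linear(q, k, v):
    """Same output without the N x N matrix; cost grows linearly with N."""
    n, d = q.shape
    # Append a column of ones so the normalizer is computed alongside the values.
    v1 = np.concatenate([v, np.ones((n, 1))], axis=1)        # (N, dv + 1)
    # Uses the identity (q . k)^2 == (q ⊗ q) . (k ⊗ k).
    q2 = np.einsum("ni,nj->nij", q, q).reshape(n, d * d)
    k2 = np.einsum("ni,nj->nij", k, k).reshape(n, d * d)
    term0 = v1.sum(axis=0, keepdims=True)                    # constant term
    term1 = q @ (k.T @ v1)                                   # linear term
    term2 = 0.5 * (q2 @ (k2.T @ v1))                         # quadratic term
    out = term0 + term1 + term2
    return out[:, :-1] / out[:, -1:]                         # divide by the normalizer


# Small random example (sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((16, 4)) * 0.5 for _ in range(3))
diff = np.abs(taylor_attention_direct(q, k, v) - taylor_attention_linear(q, k, v)).max()
print("max abs difference:", diff)
```

Both functions compute the same weighted average over all tokens. The second reorders the matrix products so that the key-value summaries are built once, at a cost that scales with N·d² rather than N², which is favorable when the number of tokens is much larger than the head dimension, as with 1x1 patches.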
Taxonomy
Topics: Advanced Image Processing Techniques
Methods: Softmax · Attention Is All You Need
