Transformer Compressed Sensing via Global Image Tokens
Marlon Bran Lorenzana, Craig Engstrom, Shekhar S. Chandra

TL;DR
This paper introduces Kaleidoscope tokens, a novel image decomposition method for vision Transformers that enables global attention in compressed sensing tasks, improving MRI reconstruction quality and efficiency.
Contribution
The paper proposes Kaleidoscope tokens for global attention in vision Transformers, replaces the CNN components of a well-known CS-MRI model with Transformer blocks, and introduces an ensemble of tokens to enhance image quality and reduce model size.
Findings
Kaleidoscope tokens enable global attention with low computational cost.
Replacing CNNs with TNN blocks improves MRI reconstruction quality.
Ensemble tokens further enhance image quality and reduce model size.
Abstract
Convolutional neural networks (CNN) have demonstrated outstanding Compressed Sensing (CS) performance compared to traditional, hand-crafted methods. However, they are broadly limited by their generalisability, their inductive bias and their difficulty in modelling long-distance relationships. Transformer neural networks (TNN) overcome such issues by implementing an attention mechanism designed to capture dependencies between inputs. However, high-resolution tasks typically require vision Transformers (ViT) to decompose an image into patch-based tokens, limiting inputs to inherently local contexts. We propose a novel image decomposition that naturally embeds images into low-resolution inputs. These Kaleidoscope tokens (KD) provide a mechanism for global attention, at the same computational cost as a patch-based approach. To showcase this development, we replace CNN components in a well-known CS-MRI…
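The contrast the abstract draws between patch-based tokens and low-resolution global tokens can be illustrated with a small sketch. This is not the paper's Kaleidoscope transform: it assumes plain strided subsampling as a stand-in, and the names `patch_tokens`, `strided_tokens` and `row_span` are hypothetical. It uses plain Python lists to stay dependency-free, and shows only that both decompositions can yield the same number of tokens of the same length while covering very different spatial extents.

```python
def patch_tokens(img, p):
    """Split an N x N image into non-overlapping p x p patches (ViT-style).

    Each token holds p*p neighbouring pixels, so its content is local.
    """
    n = len(img)
    tokens = []
    for bi in range(0, n, p):
        for bj in range(0, n, p):
            tokens.append([img[bi + di][bj + dj]
                           for di in range(p) for dj in range(p)])
    return tokens


def strided_tokens(img, s):
    """Split an N x N image into s*s strided sub-images.

    Assumed stand-in, not the paper's decomposition: token (oi, oj) takes
    every s-th pixel starting at offset (oi, oj), so each token is a
    low-resolution copy spanning the whole image.
    """
    n = len(img)
    tokens = []
    for oi in range(s):
        for oj in range(s):
            tokens.append([img[i][j]
                           for i in range(oi, n, s) for j in range(oj, n, s)])
    return tokens


def row_span(token, n):
    """Rows covered between the first and last pixel of a token.

    Assumes each pixel value encodes its flat index (value = row * n + col).
    """
    rows = [v // n for v in token]
    return max(rows) - min(rows) + 1


n = 8
# Toy image whose pixel values are their own flat indices.
img = [[r * n + c for c in range(n)] for r in range(n)]

local_tokens = patch_tokens(img, p=4)     # 4 tokens, 16 pixels each
global_tokens = strided_tokens(img, s=2)  # 4 tokens, 16 pixels each

print(len(local_tokens), len(local_tokens[0]), row_span(local_tokens[0], n))
print(len(global_tokens), len(global_tokens[0]), row_span(global_tokens[0], n))
```

Because both calls produce four tokens of sixteen pixels, a self-attention layer would see the same sequence length and token dimension in either case; the difference is that each strided token samples rows across nearly the full image, whereas each patch token is confined to a 4 x 4 neighbourhood.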
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Photoacoustic and Ultrasonic Imaging · Image and Signal Denoising Methods
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Dense Connections · Residual Connection · Softmax · Label Smoothing · Dropout
