Fourier Image Transformer
Tim-Oliver Buchholz, Florian Jug

TL;DR
The paper introduces the Fourier Image Transformer (FIT), a transformer-based approach that operates in Fourier space. It frames auto-regressive image completion as predicting a higher-resolution image from a low-resolution input, and applies an encoder-decoder setup to CT reconstruction by querying Fourier coefficients from a set of Fourier domain observations.
Contribution
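It proposes Fourier Domain Encodings (FDEs), a sequential image representation for transformers in which each prefix of the sequence describes the whole image at reduced resolution. This allows image completion and reconstruction directly in Fourier space, a domain inaccessible to traditional convolutional models.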
Findings
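Auto-regressive image completion with FDEs amounts to predicting a higher-resolution image from a low-resolution input
An encoder-decoder setup can query arbitrary Fourier coefficients from a set of Fourier domain observations, demonstrated on CT image reconstruction
Relevant image analysis tasks can be solved directly in Fourier space, a domain inaccessible to traditional convolutional models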
Abstract
Transformer architectures show spectacular performance on NLP tasks and have recently also been used for tasks such as image completion or image classification. Here we propose to use a sequential image representation, where each prefix of the complete sequence describes the whole image at reduced resolution. Using such Fourier Domain Encodings (FDEs), an auto-regressive image completion task is equivalent to predicting a higher resolution output given a low-resolution input. Additionally, we show that an encoder-decoder setup can be used to query arbitrary Fourier coefficients given a set of Fourier domain observations. We demonstrate the practicality of this approach in the context of computed tomography (CT) image reconstruction. In summary, we show that Fourier Image Transformer (FIT) can be used to solve relevant image analysis tasks in Fourier space, a domain inherently…
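The prefix property described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a full complex FFT with coefficients ordered by distance from the zero-frequency term, and the function names `fourier_domain_encoding` and `decode_prefix` are hypothetical. Decoding a prefix zeroes every coefficient that has not yet been seen, so a short prefix yields a low-pass (reduced-resolution) view of the whole image.

```python
import numpy as np


def fourier_domain_encoding(image):
    """Return (order, coeffs): FFT coefficients sorted by distance from the DC term.

    Assumption: radial ordering of the full complex spectrum; the paper's exact
    tokenization may differ.
    """
    h, w = image.shape
    # fftshift places the zero-frequency (DC) term at index (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.indices((h, w))
    radius = np.hypot(yy - h // 2, xx - w // 2)
    order = np.argsort(radius.ravel(), kind="stable")
    return order, spectrum.ravel()[order]


def decode_prefix(order, coeffs, shape, prefix_len):
    """Reconstruct an image from the first prefix_len coefficients; the rest are zero."""
    flat = np.zeros(shape[0] * shape[1], dtype=complex)
    flat[order[:prefix_len]] = coeffs[:prefix_len]
    spectrum = flat.reshape(shape)
    # Keeping only the real part discards the imaginary residue left by a
    # prefix that splits a conjugate-symmetric pair of coefficients.
    return np.fft.ifft2(np.fft.ifftshift(spectrum)).real


rng = np.random.default_rng(0)
image = rng.random((16, 16))
order, coeffs = fourier_domain_encoding(image)

# Longer prefixes give reconstructions that are closer to the original image.
prefix_lengths = [1, 16, 64, 256]
errors = [
    float(np.linalg.norm(image - decode_prefix(order, coeffs, image.shape, n)))
    for n in prefix_lengths
]
print(dict(zip(prefix_lengths, errors)))
```

In this sketch a prefix of length one holds only the DC term, so it decodes to a constant image at the mean intensity, while the full sequence recovers the input. An auto-regressive model that extends such a prefix is therefore predicting the missing high-frequency detail of the image.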
Taxonomy
Topics: Cell Image Analysis Techniques · Seismic Imaging and Inversion Techniques · Reservoir Engineering and Simulation Methods
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Dropout · Attention Is All You Need · Byte Pair Encoding · Residual Connection · Layer Normalization · Label Smoothing
