Jet: A Modern Transformer-Based Normalizing Flow
Alexander Kolesnikov, André Susano Pinto, Michael Tschannen

TL;DR
This paper introduces Jet, a coupling-based normalizing flow built from Vision Transformer blocks instead of convolutional networks. It achieves state-of-the-art results among normalizing flows for image generation with a simpler architecture than prior flow models.
Contribution
The paper presents a transformer-based normalizing flow architecture, arrived at by carefully ablating prior design choices, that outperforms previous normalizing flow models on image generation tasks.
Findings
Achieved state-of-the-art quantitative results.
Produced high-quality visual samples.
Simplified architecture compared to prior models.
Abstract
In the past, normalizing generative flows have emerged as a promising class of generative models for natural images. This type of model has many modeling advantages: the ability to efficiently compute log-likelihood of the input data, fast generation and simple overall structure. Normalizing flows remained a topic of active research but later fell out of favor, as visual quality of the samples was not competitive with other model classes, such as GANs, VQ-VAE-based approaches or diffusion models. In this paper we revisit the design of the coupling-based normalizing flow models by carefully ablating prior design choices and using computational blocks based on the Vision Transformer architecture, not convolutional neural networks. As a result, we achieve state-of-the-art quantitative and qualitative performance with a much simpler architecture. While the overall visual quality is still…
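The abstract's claims about efficient log-likelihood computation and fast generation follow from how a coupling layer is built. The sketch below is a minimal illustration in plain Python, not the authors' implementation: `conditioner` is a hypothetical stand-in for the network (Jet uses Vision Transformer blocks in that role), and the function names, the half-and-half split, and the standard normal prior are assumptions made for this example.

```python
import math

def conditioner(x_a, n_out):
    # Hypothetical stand-in for the learned network (assumption, not from the paper).
    # It may be any function of x_a; invertibility of the layer never depends on it.
    s = sum(x_a)
    log_scale = [math.tanh(0.5 * s)] * n_out
    shift = [0.1 * s] * n_out
    return log_scale, shift

def coupling_forward(x):
    # Split the input; x_a passes through unchanged and parameterizes
    # an elementwise affine transform of x_b.
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_scale, shift = conditioner(x_a, len(x_b))
    y_b = [xb * math.exp(ls) + t for xb, ls, t in zip(x_b, log_scale, shift)]
    # The Jacobian is triangular, so its log-determinant is the sum of log-scales.
    log_det = sum(log_scale)
    return x_a + y_b, log_det

def coupling_inverse(y):
    # y_a equals x_a, so the same scale and shift can be recomputed exactly.
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    log_scale, shift = conditioner(y_a, len(y_b))
    x_b = [(yb - t) * math.exp(-ls) for yb, ls, t in zip(y_b, log_scale, shift)]
    return y_a + x_b

def log_likelihood(x):
    # Change of variables: log p(x) = log N(z; 0, I) + log|det dz/dx|.
    z, log_det = coupling_forward(x)
    log_prior = sum(-0.5 * zi * zi - 0.5 * math.log(2 * math.pi) for zi in z)
    return log_prior + log_det

x = [0.3, -1.2, 0.7, 2.0]
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)
```

Both directions cost a single pass through the conditioner, which is why exact likelihoods and sampling are cheap for this model class. A practical flow stacks many such layers and varies which dimensions are left unchanged, so that every dimension is eventually transformed.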
Taxonomy
Topics: Aerodynamics and Acoustics in Jet Flows · Fluid Dynamics and Turbulent Flows · Flow Measurement and Analysis
Methods: Linear Layer · Vision Transformer · Dropout · Diffusion · Multi-Head Attention · Adam · Layer Normalization · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection
