Generalised Image Outpainting with U-Transformer
Penglei Gao, Xi Yang, Rui Zhang, John Y. Goulermas, Yujie Geng, Yuyao Yan, and Kaizhu Huang

TL;DR
This paper introduces U-Transformer, a transformer-based GAN that performs all-side image outpainting with long-range dependency modeling, producing realistic extended images for complex scenes.
Contribution
The paper presents a novel U-shaped transformer generator with multi-view TSP for flexible, all-side image outpainting, outperforming existing methods in visual quality.
Findings
Produces visually appealing outpainted images
Handles complex scenes and structures effectively
Allows arbitrary outpainting sizes during testing
Abstract
In this paper, we develop a novel transformer-based generative adversarial neural network called U-Transformer for the generalised image outpainting problem. Unlike most existing image outpainting methods, which perform horizontal extrapolation, our generalised image outpainting can extrapolate visual context on all sides of a given image with plausible structure and details, even for complicated scenery, building, and art images. Specifically, we design the generator as an encoder-to-decoder structure embedded with the popular Swin Transformer blocks. As such, our network can better cope with long-range dependencies in images, which are crucially important for generalised image outpainting. We additionally propose a U-shaped structure and a multi-view Temporal Spatial Predictor (TSP) module to reinforce image self-reconstruction as well as unknown-part prediction smoothly and…
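The all-side setting described in the abstract can be illustrated with a small sketch of the problem setup only: a known image is placed on a larger canvas, and a binary mask marks which pixels are given and which must be predicted. This is not the authors' code and says nothing about the U-Transformer generator itself; the function name `make_outpainting_input`, its margin parameters, and the zero fill value are assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]
Mask = List[List[int]]


def make_outpainting_input(
    image: Grid,
    top: int,
    bottom: int,
    left: int,
    right: int,
    fill: float = 0.0,
) -> Tuple[Grid, Mask]:
    """Place `image` on a larger canvas with unknown margins on all four sides.

    Hypothetical helper: returns (canvas, mask), where mask is 1 for known
    pixels copied from `image` and 0 for pixels a model would have to predict.
    """
    if min(top, bottom, left, right) < 0:
        raise ValueError("margins must be non-negative")

    h = len(image)
    w = len(image[0]) if h else 0
    out_h, out_w = h + top + bottom, w + left + right

    # Start with an all-unknown canvas, then copy the known region into place.
    canvas = [[fill] * out_w for _ in range(out_h)]
    mask = [[0] * out_w for _ in range(out_h)]
    for r in range(h):
        for c in range(w):
            canvas[top + r][left + c] = image[r][c]
            mask[top + r][left + c] = 1
    return canvas, mask


# A 2x2 "image" extended by different margins on each side.
image = [[1.0, 2.0], [3.0, 4.0]]
canvas, mask = make_outpainting_input(image, top=1, bottom=1, left=2, right=1)
```

Because the margins are independent arguments, the same setup covers horizontal-only extrapolation (top and bottom set to zero) as a special case of the all-side one.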
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Advanced Image Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Stochastic Depth · Dense Connections · Position-Wise Feed-Forward Layer · Adam · Label Smoothing · Absolute Position Encodings
