Outpainting by Queries
Kai Yao, Penglei Gao, Xi Yang, Kaizhu Huang, Jie Sun, and Rui Zhang

TL;DR
This paper introduces QueryOTR, a transformer-based framework for image outpainting that leverages query-based autoregression, improving global extrapolation and seamlessness over CNN-based methods.
Contribution
It proposes a novel hybrid vision-transformer encoder-decoder architecture with query expansion and patch smoothing modules for improved image outpainting.
Findings
Outperforms state-of-the-art CNN-based outpainting methods.
Generates visually appealing and seamless extrapolated images.
Accelerates convergence with the proposed modules.
Abstract
Image outpainting, which is well studied with Convolutional Neural Network (CNN) based frameworks, has recently drawn more attention in computer vision. However, CNNs rely on inherent inductive biases to achieve effective sample learning, which may degrade the performance ceiling. In this paper, motivated by the flexible self-attention mechanism with minimal inductive biases in the transformer architecture, we reframe the generalised image outpainting problem as a patch-wise sequence-to-sequence autoregression problem, enabling query-based image outpainting. Specifically, we propose a novel hybrid vision-transformer-based encoder-decoder framework, named Query Outpainting TRansformer (QueryOTR), for extrapolating visual context on all sides of a given image. The patch-wise mode's global modeling capacity allows us to extrapolate images from the attention…
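To make the patch-wise sequence framing concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a square patch size that divides the image evenly and a border measured in whole patches. The function names `patchify` and `query_positions` and all parameters are hypothetical, introduced only for illustration. It splits a known image into a sequence of patch tokens and lists the grid positions around it that an outpainting model would need to predict.

```python
import numpy as np


def patchify(image, patch):
    """Split an (H, W, C) image into a sequence of flattened patches.

    Returns an array of shape (num_patches, patch * patch * C), ordered
    row by row over the patch grid. Illustrative helper, not from the paper.
    """
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "patch must divide H and W"
    gh, gw = h // patch, w // patch
    x = image.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (gh, gw, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)


def query_positions(gh, gw, border):
    """Grid coordinates of the unknown patches surrounding a gh x gw grid.

    The enlarged grid has `border` extra patches on every side; positions
    outside the known centre are the ones a model would be queried for.
    """
    known = np.zeros((gh + 2 * border, gw + 2 * border), dtype=bool)
    known[border:border + gh, border:border + gw] = True
    return np.argwhere(~known)  # (num_queries, 2) array of (row, col)


# Toy example: an 8x8 RGB image with 4x4 patches gives a 2x2 patch grid.
image = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
tokens = patchify(image, patch=4)
queries = query_positions(2, 2, border=1)
```

In this toy setting the known image becomes 4 tokens, and a one-patch border on every side yields 12 positions to predict. How the actual model embeds the tokens and generates the border patches is not specified in the excerpt above.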
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
Methods: Convolution
