TL;DR
This paper introduces a novel, efficient attention layer called PSAL, based on stochastic patch matching, enabling high-resolution image editing tasks with reduced memory and computational costs.
Contribution
The paper presents a new stochastic attention layer using PatchMatch for efficient, scalable image editing, allowing end-to-end training and integration into various architectures.
Findings
PSAL has a much smaller memory footprint than full attention, which must compute and store the entire attention matrix.
PSAL maintains spatial precision and global context in high-resolution images.
PSAL improves performance on image inpainting, colorization, and super-resolution tasks.
Abstract
Attention mechanisms have become crucially important in deep learning in recent years. These non-local operations, which are similar to traditional patch-based methods in image processing, complement local convolutions. However, computing the full attention matrix is an expensive step with heavy memory and computational loads. These limitations constrain network architectures and performance, particularly for high-resolution images. We propose an efficient attention layer based on the stochastic algorithm PatchMatch, which is used for determining approximate nearest neighbors. We refer to our proposed layer as a "Patch-based Stochastic Attention Layer" (PSAL). Furthermore, we propose different approaches, based on patch aggregation, to ensure the differentiability of PSAL, thus allowing end-to-end training of any network containing our layer. PSAL has a small memory footprint…
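To make the memory argument concrete, here is a minimal NumPy sketch, not the authors' implementation. It contrasts dense attention, which builds a full n x n matrix, with a sparse variant in which each query keeps a single approximate nearest neighbor. The neighbor search uses random candidates only and omits PatchMatch's propagation step. The hard selection is also not differentiable with respect to queries and keys, which is the problem the paper's patch-aggregation approaches address. All function names (`full_attention`, `random_search_nn`, `sparse_attention`, `attention_entries`) are hypothetical and chosen for illustration.

```python
import numpy as np


def full_attention(q, k, v):
    """Dense softmax attention: materializes the full (n_q, n_k) matrix."""
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v


def random_search_nn(q, k, n_iters=16, seed=0):
    """Approximate nearest key for each query via random candidates.

    Assumption: a simplification of PatchMatch (random search only,
    no propagation between spatial neighbors).
    """
    rng = np.random.default_rng(seed)
    n = q.shape[0]
    best = rng.integers(0, k.shape[0], size=n)
    best_dist = np.sum((q - k[best]) ** 2, axis=1)
    for _ in range(n_iters):
        cand = rng.integers(0, k.shape[0], size=n)
        dist = np.sum((q - k[cand]) ** 2, axis=1)
        better = dist < best_dist
        best[better] = cand[better]
        best_dist[better] = dist[better]
    return best, best_dist


def sparse_attention(q, k, v, n_iters=16, seed=0):
    """Each query copies the value of its single approximate match."""
    idx, _ = random_search_nn(q, k, n_iters=n_iters, seed=seed)
    return v[idx]


def attention_entries(n, neighbors=None):
    """Number of stored entries: n*n if dense, n*neighbors if sparse."""
    return n * n if neighbors is None else n * neighbors
```

For a 256 x 256 feature map, n = 65,536 positions. The dense matrix then holds `attention_entries(65536)` = 4,294,967,296 entries (16 GiB in float32), whereas one neighbor index per position needs only 65,536 entries.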
Taxonomy
Methods: Inpainting
