DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation

Songhua Liu; Jingwen Ye; Sucheng Ren; Xinchao Wang

arXiv:2207.06124 · cs.CV · March 28, 2023 · 1 citation

TL;DR

DynaST introduces a dynamic sparse attention Transformer that achieves fine-grained, efficient exemplar-guided image generation by adaptively focusing on relevant tokens, outperforming existing methods in detail quality and computational cost.

Contribution

The paper proposes a novel dynamic attention mechanism within a Transformer for flexible, fine-level matching in exemplar-guided image generation, applicable to both supervised and unsupervised tasks.

Findings

01 Outperforms state-of-the-art in local detail preservation
02 Reduces computational cost significantly
03 Effective across multiple image translation applications

Abstract

One key challenge of exemplar-guided image generation lies in establishing fine-grained correspondences between input and guided images. Prior approaches, despite the promising results, have relied on either estimating dense attention to compute per-point matching, which is limited to only coarse scales due to the quadratic memory cost, or fixing the number of correspondences to achieve linear complexity, which lacks flexibility. In this paper, we propose a dynamic sparse attention based Transformer model, termed Dynamic Sparse Transformer (DynaST), to achieve fine-level matching with favorable efficiency. The heart of our approach is a novel dynamic-attention unit, dedicated to covering the variation on the optimal number of tokens one position should focus on. Specifically, DynaST leverages the multi-layer nature of Transformer structure, and performs the dynamic attention scheme in a…
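The trade-off described in the abstract can be made concrete with a small sketch. The first two functions compute the memory of one dense N x N attention map against a fixed-k correspondence table, for a feature map of N = height * width positions. The third function is a toy illustration of letting each query keep a variable number of keys. Its pruning rule (keep keys whose weight is at least `keep_ratio` times the row maximum) and all function and parameter names are assumptions made for this example; this is not the DynaST dynamic-attention unit, whose details are not given in the text above.

```python
import numpy as np


def dense_attention_bytes(height, width, bytes_per_value=4):
    """Bytes for one dense N x N attention map, with N = height * width."""
    n = height * width
    return n * n * bytes_per_value


def fixed_k_attention_bytes(height, width, k, bytes_per_value=4):
    """Bytes when every position stores exactly k correspondences (linear in N)."""
    n = height * width
    return n * k * bytes_per_value


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def toy_dynamic_sparse_attention(q, k, v, keep_ratio=0.5):
    """Toy attention where each query keeps a variable number of keys.

    Hypothetical rule (an assumption, not the paper's method): a key is kept
    if its attention weight is at least keep_ratio times the largest weight
    in its row. The dense matrix is still built here, so this shows only the
    per-query variation in kept tokens, not any memory saving.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    mask = weights >= keep_ratio * weights.max(axis=-1, keepdims=True)
    sparse = np.where(mask, weights, 0.0)
    sparse = sparse / sparse.sum(axis=-1, keepdims=True)  # renormalize kept weights
    return sparse @ v, mask.sum(axis=-1)


if __name__ == "__main__":
    print(dense_attention_bytes(64, 64) / 2**20, "MiB")        # 64.0 MiB
    print(dense_attention_bytes(256, 256) / 2**30, "GiB")      # 16.0 GiB
    print(fixed_k_attention_bytes(256, 256, 4) / 2**20, "MiB") # 1.0 MiB

    rng = np.random.default_rng(0)
    q = rng.normal(size=(6, 8))
    k = rng.normal(size=(10, 8))
    v = rng.normal(size=(10, 8))
    out, kept = toy_dynamic_sparse_attention(q, k, v)
    print(out.shape, kept)  # kept holds the number of keys retained per query
```

With float32 values, the dense map grows from 64 MiB at 64 x 64 to 16 GiB at 256 x 256, which is why dense matching is usually confined to coarse scales, while the fixed-k table stays small but gives every position the same budget regardless of content.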

Code & Models

Repositories

huage001/dynast (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Human Pose and Action Recognition

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Byte Pair Encoding · Attention Dropout · Position-Wise Feed-Forward Layer · Adam · Residual Connection