TL;DR
LayoutTransformer is a self-attention-based framework that learns contextual relationships between layout elements in order to generate new scene layouts, or extend existing ones, across multiple domains. The model also captures semantic properties of the primitives.
Contribution
The paper introduces LayoutTransformer, a self-attention model that generates and completes layouts in several domains and scales to an arbitrary number of primitives per layout.
Findings
Achieves competitive performance across diverse datasets.
Automatically captures semantic properties of primitives.
Supports layout generation from empty or seed sets.
Abstract
We address the problem of scene layout generation for diverse domains such as images, mobile applications, documents, and 3D objects. Most complex scenes, natural or human-designed, can be expressed as a meaningful arrangement of simpler compositional graphical primitives. Generating a new layout or extending an existing layout requires understanding the relationships between these primitives. To do this, we propose LayoutTransformer, a novel framework that leverages self-attention to learn contextual relationships between layout elements and generate novel layouts in a given domain. Our framework allows us to generate a new layout either from an empty set or from an initial seed set of primitives, and can easily scale to support an arbitrary number of primitives per layout. Furthermore, our analyses show that the model is able to automatically capture the semantic properties of the…
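The idea of generating a layout "either from an empty set or from an initial seed set of primitives" can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the `(category, x, y, w, h)` encoding with non-negative integers, the `BOS`/`EOS` sentinels, and the names `flatten`, `unflatten`, `complete`, and `toy_next_token` are all hypothetical. In particular, `toy_next_token` is a deterministic stand-in for a trained self-attention model, so the loop shows only the control flow of autoregressive completion, not the paper's actual method.

```python
from typing import Callable, List, Tuple

# Hypothetical encoding (an assumption, not from the abstract): each primitive
# is (category, x, y, w, h) with non-negative integer values.
Primitive = Tuple[int, int, int, int, int]
BOS, EOS = -1, -2  # illustrative sentinel tokens
FIELDS = 5         # number of tokens per primitive


def flatten(layout: List[Primitive]) -> List[int]:
    """Serialize a list of primitives into one flat token sequence."""
    tokens = [BOS]
    for prim in layout:
        tokens.extend(prim)
    return tokens


def unflatten(tokens: List[int]) -> List[Primitive]:
    """Inverse of flatten; drops sentinels and any trailing partial primitive."""
    body = [t for t in tokens if t not in (BOS, EOS)]
    usable = len(body) - len(body) % FIELDS
    return [tuple(body[i:i + FIELDS]) for i in range(0, usable, FIELDS)]


def complete(seed: List[Primitive],
             next_token: Callable[[List[int]], int],
             max_primitives: int = 8) -> List[Primitive]:
    """Extend a seed layout one token at a time until EOS or the length cap.

    An empty seed corresponds to generating a layout from scratch.
    """
    tokens = flatten(seed)
    max_len = 1 + FIELDS * max_primitives
    while len(tokens) < max_len:
        tok = next_token(tokens)
        if tok == EOS:
            break
        tokens.append(tok)
    return unflatten(tokens)


def toy_next_token(tokens: List[int]) -> int:
    """Stand-in for a trained model: emits a deterministic value and
    stops once three complete primitives are present."""
    n = len(tokens) - 1  # tokens after BOS
    if n >= 3 * FIELDS and n % FIELDS == 0:
        return EOS
    return n % 7


seed = [(1, 0, 0, 4, 2)]
from_empty = complete([], toy_next_token)    # generation from an empty set
from_seed = complete(seed, toy_next_token)   # extension of a seed set
```

In this sketch the seed primitives are kept unchanged as the prefix of the sequence, and the predictor is only asked for what follows. Because a layout is just a variable-length sequence here, the number of primitives is limited only by the `max_primitives` cap.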
