Diverse Multimedia Layout Generation with Multi Choice Learning

David D. Nguyen; Surya Nepal; Salil S. Kanhere

arXiv:2301.06629·cs.CV·January 18, 2023

TL;DR

This paper introduces LayoutMCL, a neural network model that generates diverse multimedia layouts by predicting multiple options simultaneously, addressing the limitations of single-choice models and improving layout diversity and quality.

Contribution

The paper proposes LayoutMCL, a multi-choice learning framework with winner-takes-all loss, enabling stable, diverse layout generation unlike existing single-prediction models.
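The winner-takes-all idea named above can be illustrated with a minimal sketch. It is not the paper's implementation: the squared-error criterion, the two-head setup, and the names `squared_error`, `wta_loss`, and `heads` are assumptions chosen for illustration. The sketch scores every candidate prediction against the target and keeps only the best one, so in training only that "winning" head would be updated.

```python
def squared_error(pred, target):
    """Sum of squared differences between two equal-length coordinate lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def wta_loss(predictions, target):
    """Winner-takes-all: return (loss, index) of the head closest to the target.

    Assumption: squared error is the per-head criterion. In a training loop,
    only the winning head would receive a gradient from this loss.
    """
    losses = [squared_error(p, target) for p in predictions]
    winner = min(range(len(losses)), key=losses.__getitem__)
    return losses[winner], winner


# Two hypothetical heads predicting a logo centre (x, y) on a unit canvas.
heads = [[0.1, 0.1], [0.9, 0.9]]
loss_a, winner_a = wta_loss(heads, [0.0, 0.0])  # designer who prefers top-left
loss_b, winner_b = wta_loss(heads, [1.0, 1.0])  # designer who prefers bottom-right
```

Because each target is charged only to its nearest head, the two heads are free to specialise on different placements instead of being pulled toward a shared compromise.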

Findings

01. Reduces Fréchet Inception Distance (FID) by 83-98% on real data benchmarks.

02. Generates significantly more diverse layouts than existing methods.

03. Effectively models multiple acceptable layout options.

Abstract

Designing visually appealing layouts for multimedia documents containing text, graphs and images requires a form of creative intelligence. Modelling the generation of layouts has recently gained attention due to its importance in aesthetics and communication style. In contrast to standard prediction tasks, there is a range of acceptable layouts, which depend on user preferences. For example, one poster designer may prefer logos on the top-left while another prefers logos on the bottom-right. Both are correct choices, yet existing machine learning models treat layout generation as a single-choice prediction problem. In such situations, these models would simply average over all possible choices given the same input, forming a degenerate sample. In the above example, this would form an unacceptable layout with a logo in the centre. In this paper, we present an auto-regressive neural network…
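The averaging failure described in the abstract can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions, not the paper's experiment: the two logo positions, the unit canvas, and the mean-squared-error objective are hypothetical choices for illustration. Under squared error, the single prediction that minimises the average loss over both designers' choices is their mean, which lands in the centre.

```python
def mse(pred, targets):
    """Mean squared error of one (x, y) prediction against several targets."""
    total = sum((pred[0] - x) ** 2 + (pred[1] - y) ** 2 for x, y in targets)
    return total / len(targets)


# Hypothetical logo centres on a unit canvas: top-left and bottom-right.
targets = [(0.1, 0.1), (0.9, 0.9)]

# The mean of the targets is the squared-error minimiser for a single prediction.
centre = tuple(sum(coords) / len(targets) for coords in zip(*targets))

loss_at_centre = mse(centre, targets)        # compromise placement
loss_at_top_left = mse(targets[0], targets)  # committing to one valid choice
```

Here the compromise scores a lower average loss than committing to either valid placement, which is why a single-output model trained this way drifts toward the centre even though no designer chose it.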
