PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design

Jiazhe Wei; Ken Li; Tianyu Lao; Haofan Wang; Liang Wang; Caifeng Shan; Chenyang Si

arXiv:2512.04082·cs.CV·December 4, 2025

Open Access

TL;DR

PosterCopilot is a framework that improves layout reasoning in Large Multimodal Models (LMMs) and supports controllable, iterative, layer-specific editing for professional graphic design, trained with a progressive three-stage strategy.

Contribution

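The paper introduces a progressive three-stage training strategy (Perturbed Supervised Fine-Tuning, Reinforcement Learning for Visual-Reality Alignment, and Reinforcement Learning from Aesthetic Feedback) and a complete workflow that together give graphic design LMMs geometric understanding and layer-specific editing.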

Findings

01. Achieves geometrically accurate layouts
02. Provides aesthetically superior designs
03. Enables layer-controllable iterative editing

Abstract

Graphic design forms the cornerstone of modern visual communication, serving as a vital medium for promoting cultural and commercial events. Recent advances have explored automating this process using Large Multimodal Models (LMMs), yet existing methods often produce geometrically inaccurate layouts and lack the iterative, layer-specific editing required in professional workflows. To address these limitations, we present PosterCopilot, a framework that advances layout reasoning and controllable editing for professional graphic design. Specifically, we introduce a progressive three-stage training strategy that equips LMMs with geometric understanding and aesthetic reasoning for layout design, consisting of Perturbed Supervised Fine-Tuning, Reinforcement Learning for Visual-Reality Alignment, and Reinforcement Learning from Aesthetic Feedback. Furthermore, we develop a complete workflow…

Taxonomy

Topics: Interactive and Immersive Displays · 3D Shape Modeling and Analysis · Data Visualization and Analytics