Self-Reasoning Agentic Framework for Narrative Product Grid-Collage Generation
Minyan Luo, Yuxin Zhang, Yifei Li, Xincan Wang, Fuzhang Wu, Tong-Yee Lee, Oliver Deussen, Weiming Dong

TL;DR
This paper introduces a self-reasoning agentic framework for generating narrative product grid collages that ensure visual consistency, storytelling coherence, and aesthetic harmony through explicit planning and iterative refinement.
Contribution
It presents a novel framework that constructs a product narrative, generates coordinated collages with shared style, and employs self-evaluation and refinement for improved quality.
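The plan, generate, evaluate, refine cycle described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `CollagePlan`, `plan_narrative`, `refine_loop`, the score threshold, and the stub generator and evaluator are all hypothetical names and values introduced here, and a real system would call an image model and a learned or LLM-based critic in their place.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class CollagePlan:
    """Hypothetical container for a narrative plan shared by all grid cells."""
    product_name: str
    shared_style: str
    panel_prompts: List[str]
    revision_notes: List[str] = field(default_factory=list)


def plan_narrative(product_name: str, n_panels: int = 4) -> CollagePlan:
    # Assumed planner: one story beat per panel, one style for the whole grid.
    beats = ["hero shot", "in use", "detail close-up", "lifestyle scene"]
    prompts = [f"{product_name}: {beats[i % len(beats)]}" for i in range(n_panels)]
    return CollagePlan(product_name, "warm natural light", prompts)


def refine_loop(
    plan: CollagePlan,
    generate: Callable[[CollagePlan], str],
    evaluate: Callable[[str, CollagePlan], Tuple[float, str]],
    max_rounds: int = 3,
    threshold: float = 0.8,
) -> Tuple[str, float, int]:
    """Generate one unified collage per round, score it, and feed the
    critique back into the plan until the score passes the threshold."""
    best_image, best_score, rounds = "", float("-inf"), 0
    for rounds in range(1, max_rounds + 1):
        image = generate(plan)                  # single image for the whole grid
        score, feedback = evaluate(image, plan)
        if score > best_score:
            best_image, best_score = image, score
        if score >= threshold:
            break
        plan.revision_notes.append(feedback)    # critique informs the next round
    return best_image, best_score, rounds


# Stand-in generator and evaluator, used only to exercise the loop.
def fake_generate(plan: CollagePlan) -> str:
    return f"collage[{len(plan.panel_prompts)} panels, {len(plan.revision_notes)} revisions]"


def fake_evaluate(image: str, plan: CollagePlan) -> Tuple[float, str]:
    score = 0.5 + 0.2 * len(plan.revision_notes)
    return score, "tighten style consistency across panels"


image, score, rounds = refine_loop(plan_narrative("Ceramic Mug"), fake_generate, fake_evaluate)
```

Keeping the best-scoring image rather than the last one is a design choice made for this sketch, so that a refinement round that happens to score lower does not discard an earlier, better result.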
Findings
The framework improves aesthetic quality over baseline methods.
It enhances narrative richness and visual coherence.
Iterative self-refinement leads to better results.
Abstract
Narrative-driven product photography has become a prevalent paradigm in modern marketing, as coherent visual storytelling helps convey product value and establishes emotional engagement with consumers. However, existing image generation methods do not support structured narrative planning or cross-panel coordination, often resulting in weak storytelling and visual incoherence. In practice, narrative product photography is commonly presented as multi-grid collages, where multiple views or scenes jointly communicate a product narrative. To ensure visual consistency across grids and aesthetic harmony of the overall composition, we generate the collage as a single unified image rather than composing independently synthesized panels. We propose a self-reasoning agentic framework for narrative product grid collage generation. Given a product packshot and its name, the system first constructs…
