Generative Visual Communication in the Era of Vision-Language Models
Yael Vinker

TL;DR
This dissertation explores how recent vision-language models can automate the design of visual communication. It targets two weaknesses of current generative models: difficulty distilling complex ideas into clear, abstract visuals, and pixel-based outputs that lack the flexibility many design tasks need.
Contribution
It constrains the operational space of vision-language models and introduces task-specific regularizations to produce more effective visual communication designs.
Findings
Enhanced abstraction and sketch generation capabilities
Improved visual storytelling and design flexibility
Potential for automating complex visual communication tasks
Abstract
Visual communication, dating back to prehistoric cave paintings, is the use of visual elements to convey ideas and information. In today's visually saturated world, effective design demands an understanding of graphic design principles, visual storytelling, human psychology, and the ability to distill complex information into clear visuals. This dissertation explores how recent advancements in vision-language models (VLMs) can be leveraged to automate the creation of effective visual communication designs. Although generative models have made great progress in generating images from text, they still struggle to simplify complex ideas into clear, abstract visuals and are constrained by pixel-based outputs, which lack flexibility for many design tasks. To address these challenges, we constrain the models' operational space and introduce task-specific regularizations. We explore various…
Taxonomy
Topics: Language, Metaphor, and Cognition
