Personalized Image Generation from an Author Writing Style
Sagar Gandhi, Vishal Gandhi

TL;DR
This paper presents a novel pipeline that translates structured author writing profiles into personalized visual representations using large language models and diffusion techniques, validated through human evaluation.
Contribution
It introduces an end-to-end method for generating author style-specific images from structured literary summaries, combining LLMs and diffusion models.
Findings
Human evaluators gave high style match scores (mean 4.08/5).
Images effectively capture mood and atmosphere.
Challenges remain in representing abstract narrative elements.
Abstract
Translating nuanced, textually-defined authorial writing styles into compelling visual representations presents a novel challenge in generative AI. This paper introduces a pipeline that leverages Author Writing Sheets (AWS), structured summaries of an author's literary characteristics, as input to a Large Language Model (LLM, Claude 3.7 Sonnet). The LLM interprets the AWS to generate three distinct, descriptive text-to-image prompts, which are then rendered by a diffusion model (Stable Diffusion 3.5 Medium). We evaluated our approach using 49 author styles from Reddit data, with human evaluators assessing the stylistic match and visual distinctiveness of the generated images. Results indicate a good perceived alignment between the generated visuals and the textual authorial profiles (mean style match: 4.08/5), with images rated as moderately distinctive. Qualitative analysis further…
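The two-stage flow described in the abstract (AWS text in, three prompts from an LLM, one image per prompt from a diffusion model) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the instruction wording, the one-prompt-per-line output format, and every name here (`build_instruction`, `parse_prompts`, `generate_author_images`, `StyledImage`, `fake_llm`, `fake_renderer`) are assumptions. The LLM and the diffusion model are passed in as plain callables so the sketch runs without either service; in practice they would wrap calls to the actual models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StyledImage:
    """One rendered image together with the prompt that produced it."""
    author_id: str
    prompt: str
    image: bytes


def build_instruction(aws_text: str, n_prompts: int = 3) -> str:
    # Hypothetical instruction wording; the paper's actual prompt is not given here.
    return (
        f"Read the Author Writing Sheet below and write {n_prompts} distinct, "
        "descriptive text-to-image prompts that capture the author's style. "
        "Return one prompt per line.\n\n" + aws_text
    )


def parse_prompts(llm_output: str, n_prompts: int = 3) -> List[str]:
    # Assumes the LLM returns one prompt per non-empty line.
    lines = [ln.strip() for ln in llm_output.splitlines() if ln.strip()]
    if len(lines) < n_prompts:
        raise ValueError(f"expected {n_prompts} prompts, got {len(lines)}")
    return lines[:n_prompts]


def generate_author_images(
    author_id: str,
    aws_text: str,
    llm: Callable[[str], str],
    renderer: Callable[[str], bytes],
    n_prompts: int = 3,
) -> List[StyledImage]:
    # Stage 1: the LLM turns the writing sheet into image prompts.
    prompts = parse_prompts(llm(build_instruction(aws_text, n_prompts)), n_prompts)
    # Stage 2: the diffusion model renders each prompt.
    return [StyledImage(author_id, p, renderer(p)) for p in prompts]


def fake_llm(instruction: str) -> str:
    # Stand-in for the LLM call; returns three fixed placeholder prompts.
    return "a fog-bound harbor at dawn\n\na cluttered attic study\na lone road at dusk\n"


def fake_renderer(prompt: str) -> bytes:
    # Stand-in for the diffusion model; returns placeholder bytes, not a real image.
    return prompt.encode("utf-8")


if __name__ == "__main__":
    for item in generate_author_images("author_001", "(AWS text here)", fake_llm, fake_renderer):
        print(item.author_id, "|", item.prompt, "|", len(item.image), "bytes")
```

Keeping the two stages behind callables makes it easy to swap either model or to test the prompt-parsing step on its own.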
