EmoCtrl: Controllable Emotional Image Content Generation
Jingyuan Yang, Weibin Luo, Hui Huang

TL;DR
EmoCtrl is an image generation model that produces content-faithful images with controllable emotional expression and aligns more closely with human preferences than existing methods.
Contribution
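The paper introduces EmoCtrl, a model for emotion-aware image generation. It is supported by a dataset annotated with content, emotion, and affective prompts, and it adds textual and visual emotion enhancement modules so that generated images stay faithful to the content description while expressing the target emotion.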
Findings
EmoCtrl achieves better content fidelity and emotional control than existing models.
User studies show EmoCtrl aligns closely with human emotional preferences.
EmoCtrl generalizes well to creative applications, demonstrating robustness.
Abstract
An image conveys meaning through both its visual content and emotional tone, jointly shaping human perception. We introduce Controllable Emotional Image Content Generation (C-EICG), which aims to generate images that remain faithful to a given content description while expressing a target emotion. Existing text-to-image models ensure content consistency but lack emotional awareness, whereas emotion-driven models generate affective results at the cost of content distortion. To address this gap, we propose EmoCtrl, supported by a dataset annotated with content, emotion, and affective prompts, bridging abstract emotions to visual cues. EmoCtrl incorporates textual and visual emotion enhancement modules that enrich affective expression via descriptive semantics and perceptual cues. To align with human preference, we further introduce an emotion-driven preference optimization with…
