CogBlender: Towards Continuous Cognitive Intervention in Text-to-Image Generation
Shengqi Dang, Yi He, Jiaying Lei, Ziqing Qian, Nan Cao

TL;DR
CogBlender introduces a two-stage method for controlling cognitive properties in text-to-image generation, enabling smooth, continuous adjustment of the psychological responses that generated images elicit.
Contribution
It presents a new algorithm that allows continuous, multi-dimensional intervention on cognitive properties during image generation, bridging semantic content and psychological intent.
Findings
Effective control over valence, arousal, dominance, and memorability in generated images.
Smooth interpolation of cognitive states through flow-matching model velocity fields.
Extensive experiments demonstrate the method on all four cognitive properties.
Abstract
Beyond conveying semantic information, images also possess cognitive properties that elicit specific psychological responses from viewers, such as memory encoding or emotional reactions. Although modern text-to-image (T2I) models generate semantically coherent content effectively, they struggle to control cognitive properties (e.g., valence, memorability) and often fail to align with the user's psychological intent. To bridge the gap, we introduce CogBlender, an algorithm that enables continuous and multi-dimensional intervention on cognitive properties through a novel two-stage approach. First, we construct discrete cognition-aware rewritten prompts: variants of the input prompt that represent distinct extreme cognitive states. Second, we translate these discrete prompts into continuous control signals by interpolating within the velocity-field domain of flow-matching models. By…
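The second stage described above, interpolating between prompt-conditioned velocity fields, can be illustrated with a small toy sketch. It is not the authors' implementation: the abstract does not give the interpolation formula, so the linear blend below is an assumption, and `toy_velocity`, `blended_velocity`, `sample`, and the `alpha` parameter are hypothetical names introduced only for illustration. A closed-form toy field stands in for a real flow-matching network conditioned on a rewritten prompt.

```python
import numpy as np

def toy_velocity(x, t, target):
    """Hypothetical stand-in for a prompt-conditioned velocity field.

    It moves any point x so that it reaches `target` at t = 1. A real
    model would be a neural network conditioned on a rewritten prompt.
    """
    return (target - x) / (1.0 - t)

def blended_velocity(x, t, target_low, target_high, alpha):
    """Linear blend of two conditional velocities (an assumption).

    alpha = 0 follows the 'low' extreme, alpha = 1 the 'high' extreme,
    and values in between give intermediate control signals.
    """
    v_low = toy_velocity(x, t, target_low)
    v_high = toy_velocity(x, t, target_high)
    return (1.0 - alpha) * v_low + alpha * v_high

def sample(x0, target_low, target_high, alpha, num_steps=50):
    """Integrate the blended field from t = 0 to t = 1 with Euler steps."""
    x = np.asarray(x0, dtype=float).copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # t stays below 1, so the toy field is well defined
        x = x + dt * blended_velocity(x, t, target_low, target_high, alpha)
    return x

# Usage: sweep alpha to move continuously between the two extremes.
x0 = np.zeros(2)
low = np.array([-1.0, 0.0])
high = np.array([1.0, 2.0])
for alpha in (0.0, 0.5, 1.0):
    print(alpha, sample(x0, low, high, alpha))
```

In this toy setting, sweeping `alpha` from 0 to 1 moves the sample smoothly from one extreme to the other, which is the kind of continuous control the abstract describes. With a real model the endpoints would be images generated from the two extreme rewritten prompts rather than fixed vectors.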
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Data Visualization and Analytics
