MooD: Perception-Enhanced Efficient Affective Image Editing via Continuous Valence-Arousal Modeling
Xinyi Yin, Yiduo Wang, Tingqi Hu, Meicong Si, Yunyun Shi, Shi Chen, Hao Wang, Junxiao Xue, Xuecheng Wu

TL;DR
MooD is a novel framework for affective image editing that uses continuous Valence-Arousal modeling to enable fine-grained, efficient, and controllable modifications across diverse scenarios.
Contribution
MooD introduces a VA-aware retrieval strategy, visual transfer, and semantic guidance, along with a new VA-annotated dataset, AffectSet, for improved emotion-driven image editing.
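To make the idea of VA-aware retrieval concrete, here is a minimal sketch of one way a continuous (valence, arousal) target could be matched against a VA-annotated pool: a plain nearest-neighbor lookup in the 2D VA plane. This is an illustrative assumption, not the retrieval method the paper describes. The function name `retrieve_by_va`, the `[-1, 1]` value range, the Euclidean distance, and the sample entries are all hypothetical.

```python
import math
from typing import List, Tuple

# Hypothetical annotated pool entry: (image_id, valence, arousal).
# The [-1, 1] value range is an assumption, not taken from the paper.
Entry = Tuple[str, float, float]


def retrieve_by_va(pool: List[Entry], valence: float, arousal: float, k: int = 2) -> List[str]:
    """Return the ids of the k entries closest to the target VA point.

    Closeness is Euclidean distance in the valence-arousal plane,
    an illustrative choice rather than the paper's actual metric.
    """
    ranked = sorted(pool, key=lambda e: math.hypot(e[1] - valence, e[2] - arousal))
    return [image_id for image_id, _, _ in ranked[:k]]


# Made-up sample entries for illustration only.
pool = [
    ("calm_lake", 0.6, -0.5),
    ("storm_sea", -0.6, 0.8),
    ("party", 0.8, 0.7),
    ("empty_room", -0.4, -0.6),
]

# Target: high valence, high arousal.
result = retrieve_by_va(pool, valence=0.7, arousal=0.6, k=2)
print(result)
```

Because the target is a point rather than a category label, nearby targets such as (0.7, 0.6) and (0.6, 0.5) rank the pool by smoothly varying distances, which is the practical appeal of a continuous representation over discrete emotion classes.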
Findings
MooD outperforms existing methods in affective controllability and visual fidelity.
The framework maintains high efficiency in image editing tasks.
Ablation studies highlight key factors influencing performance.
Abstract
Affective Image Editing (AIE) aims to modify visual content to evoke targeted emotions. Although current approaches achieve impressive editing quality, they often overlook inference efficiency, which limits their applicability in computational social scenarios. Moreover, most methods depend on discrete emotion representations, which hinder the continuous modeling of complex human emotions and constrain expressive capabilities in interactive scenarios. To address these gaps, we propose MooD, the first framework that directly leverages continuous Valence-Arousal (VA) values as editing instructions for fine-grained and efficient AIE in computational social systems. Specifically, we first introduce a VA-aware retrieval strategy to bridge vague affective values and detailed visual semantics. Building upon this, MooD integrates visual transfer and perception-enhanced semantic guidance to…
