EmoAttack: Emotion-to-Image Diffusion Models for Emotional Backdoor Generation

Tianyu Wei; Shanmin Pang; Qi Guo; Yizhuo Ma; Xiaofeng Cao; Qing Guo

arXiv:2406.15863·cs.CV·October 31, 2025

TL;DR

This paper introduces EmoAttack, a novel backdoor attack on text-to-image diffusion models that uses emotional input to generate malicious negative content, highlighting a new security risk in AI-generated imagery.

Contribution

The paper proposes EmoBooth, a method that embeds emotional triggers into diffusion models to mount backdoor attacks without extensive retraining, and provides a dataset and analysis validating its effectiveness.

Findings

1. EmoBooth successfully triggers malicious content with emotional inputs.
2. The attack method avoids extensive retraining of diffusion models.
3. Analysis confirms the feasibility and threat of emotion-aware backdoor attacks.

Abstract

Text-to-image diffusion models can generate realistic images based on textual inputs, enabling users to convey their opinions visually through language. Meanwhile, within language, emotion plays a crucial role in expressing personal opinions in our daily lives and the inclusion of maliciously negative content can lead users astray, exacerbating negative emotions. Recognizing the success of diffusion models and the significance of emotion, we investigate a previously overlooked risk associated with text-to-image diffusion models, that is, utilizing emotion in the input texts to introduce negative content and provoke unfavorable emotions in users. Specifically, we identify a new backdoor attack, i.e., emotion-aware backdoor attack (EmoAttack), which introduces malicious negative content triggered by emotional texts during image generation. We formulate such an attack as a diffusion…

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 5 · Confidence 4

Strengths

* The method presented demonstrates a strategy to map more abstract concepts to targeted negative content without affecting the images generated using “normal” concepts, and the limitations of existing methods are presented well. Additionally, the description of the attack methodology itself is very clear.
* The ablation study provides good insight into the parameters within which the attack method is likely to perform as expected.
* The inclusion of two types of attack scenarios provides a

Weaknesses

Despite the clarity of the methodological sections of the paper, the primary weaknesses of this work relate to the clarity of the subsequent presentation of the experimental procedure and evaluation. The basic set-up of the baselines is not described in sufficient detail in the main body of the paper. Additionally, the means of determining the exact values of the coefficients in the EAC metric are not sufficiently described in the Appendix. These values are particularly important when ranking met

Reviewer 02 · Rating 5 · Confidence 2

Strengths

1. This paper targets safety issues in the current text-to-image generation models. This research perspective is interesting and meaningful in practice.
2. It identifies some drawbacks of naive solutions by preliminary empirical study. Then it proposes a coherent framework to address these issues.
3. It conducts many experiments to evaluate the performance of the proposed method in the new task.

Weaknesses

1. Concerns about the problem formulation:
   * From my perception, this task is a certain type of controllable text-to-image generation, where the control signals are negative emotional textual words and the output should be a certain type of negative and malicious images. Therefore, I think it may not be proper to treat it as an “attack”, because when the text-to-image model generates violent images given the emotional textual prompt, it seems the model faithfully follows the textual instruct

Reviewer 03 · Rating 8 · Confidence 5

Strengths

1) A novel dataset, prepared by considering different scenarios and attacks, which makes it a helpful real-time dataset.
2) The analysis is really good. It covers all the points that readers need to know, such as analysis of different situations and attacks.
3) Covering the limitations of other datasets in the paper helps readers understand different perspectives and the challenges of existing work, which helps explain why this dataset is needed.

Weaknesses

1) I do not find the latest SOTA diffusion models, such as DALL-E or Stable Diffusion, being implemented.
2) It would be great if more scenarios were covered instead of a few. The images represent a kind of violence; it would be helpful if other emotions, such as discrimination, were also provided. I do not think they are present. Anyway, it's a strong contribution.

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization

Methods: Diffusion