Mitigating Semantic Collapse in Generative Personalization with Test-Time Embedding Adjustment
Anh Bui, Trang Vu, Trung Le, Junae Kim, Tamas Abraham, Rollin Omari, Amar Kaur, Dinh Phung

TL;DR
This paper addresses the semantic collapse problem in generative personalization by proposing a training-free inference-time embedding adjustment method that preserves the original meaning of learned visual concepts.
Contribution
The authors introduce a novel, training-free approach to adjust embeddings at inference time, effectively mitigating semantic collapse in personalized generative models.
Findings
Significant improvement in text-image alignment across various personalization methods.
The proposed method is simple, training-free, and broadly applicable.
Effective in maintaining semantic richness and diversity in generated images.
Abstract
In this paper, we investigate the semantic collapsing problem in generative personalization, an under-explored topic where the learned visual concept (V*) gradually shifts from its original textual meaning and comes to dominate other concepts in multi-concept input prompts. This issue not only reduces the semantic richness of complex input prompts like "a photo of V* wearing glasses and playing guitar" into simpler, less contextually rich forms such as "a photo of V*" but also leads to simplified output images that fail to capture the intended concept. We identify the root cause as unconstrained optimisation, which allows the learned embedding to drift arbitrarily in the embedding space, both in direction and magnitude. To address this, we propose a simple yet effective training-free method that adjusts the magnitude and direction of the pre-trained embedding at inference time,…
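The abstract describes the adjustment only as a change to the magnitude and direction of a learned embedding at inference time. A minimal sketch of what such an adjustment could look like follows. It is an illustration under stated assumptions, not the authors' implementation: the function names (`slerp`, `adjust_embedding`), the use of spherical interpolation toward a reference embedding, and the rescaling to a multiple of the reference norm are all hypothetical. The defaults `alpha=0.2` and `beta=1.5` mirror the values quoted in the reviews, but mapping them onto "rotation" and "scale" is an assumption made here.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def slerp(a, b, t):
    """Rotate the direction of `a` toward the direction of `b` by fraction `t`.

    Returns a unit vector. Hypothetical helper, not taken from the paper.
    """
    ua = [x / norm(a) for x in a]
    ub = [x / norm(b) for x in b]
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(ua, ub))))
    omega = math.acos(dot)
    s = math.sin(omega)
    if s < 1e-8:
        # Parallel or antiparallel: the rotation plane is undefined,
        # so keep the learned direction unchanged.
        return ua
    wa = math.sin((1.0 - t) * omega) / s
    wb = math.sin(t * omega) / s
    return [wa * x + wb * y for x, y in zip(ua, ub)]


def adjust_embedding(learned, reference, alpha=0.2, beta=1.5):
    """Assumed test-time adjustment of a learned concept embedding.

    alpha: assumed fraction of the angle to rotate toward the reference.
    beta:  assumed target norm, as a multiple of the reference norm.
    """
    direction = slerp(learned, reference, alpha)   # fix direction drift
    target_norm = beta * norm(reference)           # fix magnitude drift
    return [target_norm * x for x in direction]


# Toy example: a learned embedding that drifted far from its reference
# (for instance, the embedding of the concept's class word).
learned = [30.0, 0.0]
reference = [0.0, 2.0]
adjusted = adjust_embedding(learned, reference)
```

In this toy case the learned vector has fifteen times the reference norm and is orthogonal to it; the adjusted vector is pulled part of the way back in direction and rescaled to a norm tied to the reference, while the model weights are left untouched.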
Peer Reviews
Decision: ICLR 2026 Poster
The work identifies the Semantic Collapsing Problem (SCP) in generative personalization, an under-explored issue, and rigorously traces it to unconstrained optimization as its root cause, with solid empirical evidence across textual and image spaces. The proposed TEA method is lightweight and practical: it requires no additional training, avoids modifying model weights, and generalizes well across diverse frameworks (e.g., Textual Inversion, DreamBooth) and architectures (Stable Diffusion, Flux), mak
TEA relies on fixed hyperparameters (α=0.2, β=1.5) across all prompts, which may not be optimal for diverse scenarios.
1. High quality of writing and presentation. 2. The introduction of TEA, a method that is training-free and easily transferable. 3. An insightful analysis of the "Semantic Collapsing Problem" within generative personalization and the mechanics of Anti-DreamBooth methods.
1. The paper lacks comparisons to recent, mainstream works, particularly those in the in-context generation paradigm (e.g., OminiControl[1], FLUX.1 Kontext[2], and Diffusion Self-Distillation[3]). 2. The evaluation is insufficient. It should be strengthened by including MLLM-based benchmarks, such as DreamBench++[4]. 3. Table 1 shows a performance decrease in Reference and Image alignment scores. This is presumably because TEA's objective focuses exclusively on fidelity between the learned and o
The identification and empirical analysis of SCP are interesting. TEA is lightweight, requires no retraining, and is compatible with numerous existing frameworks. The paper is well-structured and easy to follow.
While the Test-time Embedding Adjustment (TEA) method is practical and easy to deploy, its technical contribution is relatively modest. The core mechanism—adjusting the magnitude and direction of an embedding vector—is a straightforward application of existing vector space operations, lacking the novelty of a more transformative technique. The approach does not introduce new learning paradigms or architectural innovations, but rather applies a post-hoc correction to the outputs of existing model
