Coevolving Representations in Joint Image-Feature Diffusion

Theodoros Kouzelis; Spyros Gidaris; Nikos Komodakis

arXiv:2604.17492·cs.CV·April 21, 2026

TL;DR

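This paper introduces CoReDi, a framework in which the semantic representation space is adapted during diffusion training through a lightweight linear projection learned jointly with the model. The paper reports faster convergence and higher sample quality than with a fixed representation space.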

Contribution

The paper proposes a coevolving representation method that adapts the semantic space during training, instead of keeping a fixed space constructed independently of the generative objective, and reports that this improves diffusion models.

Findings

01 CoReDi achieves faster convergence in diffusion training.

02 It produces higher-quality generated images.

03 Adaptive semantic spaces complement image latents better than fixed ones.

Abstract

Joint image-feature generative modeling has recently emerged as an effective strategy for improving diffusion training by coupling low-level VAE latents with high-level semantic features extracted from pre-trained visual encoders. However, existing approaches rely on a fixed representation space, constructed independently of the generative objective and kept unchanged during training. We argue that the representation space guiding diffusion should itself adapt to the generative task. To this end, we propose Coevolving Representation Diffusion (CoReDi), a framework in which the semantic representation space evolves during training by learning a lightweight linear projection jointly with the diffusion model. While naively optimizing this projection leads to degenerate solutions, we show that stable coevolution can be achieved through a combination of stop-gradient targets, normalization,…
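
The role of normalization mentioned in the abstract can be illustrated with a small sketch. It assumes, without confirmation from this excerpt, that one degenerate solution is scale collapse: a learned projection that shrinks toward zero makes the projected feature target trivially easy to predict. The helper names (`project`, `standardize`, `mean_energy`), the dimensions, and the choice of per-channel batch standardization are all hypothetical and are not taken from the paper or its repository.

```python
import random

def project(W, f):
    """Apply a linear projection W (k x d) to a feature vector f (length d)."""
    return [sum(w * x for w, x in zip(row, f)) for row in W]

def standardize(batch, eps=1e-12):
    """Normalize each channel to zero mean and unit variance across the batch.

    Assumption: this stands in for whatever normalization the paper uses.
    """
    n, k = len(batch), len(batch[0])
    out = [[0.0] * k for _ in range(n)]
    for c in range(k):
        col = [z[c] for z in batch]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        std = (var + eps) ** 0.5
        for i in range(n):
            out[i][c] = (batch[i][c] - mean) / std
    return out

def mean_energy(batch):
    """Average squared norm per sample."""
    return sum(sum(v * v for v in z) for z in batch) / len(batch)

rng = random.Random(0)
d, k, n = 8, 3, 256  # hypothetical feature dim, projected dim, batch size

# Stand-in for semantic features from a pre-trained encoder.
features = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]

# A projection and a "collapsed" copy of it scaled down by 1e-3.
W = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
W_small = [[1e-3 * w for w in row] for row in W]

raw = [project(W, f) for f in features]
raw_small = [project(W_small, f) for f in features]

# Without normalization, shrinking W shrinks the target quadratically.
energy_raw = mean_energy(raw)
energy_raw_small = mean_energy(raw_small)

# With normalization, the target keeps the same energy for both projections.
energy_norm = mean_energy(standardize(raw))
energy_norm_small = mean_energy(standardize(raw_small))

print("unnormalized:", energy_raw, energy_raw_small)
print("normalized:  ", energy_norm, energy_norm_small)
```

In this sketch the unnormalized target loses energy as the projection shrinks, while the standardized target does not, so scaling the projection down no longer makes the target easier to predict. The stop-gradient ingredient is not shown here; in an autodiff framework it would typically be expressed with a call such as `torch.Tensor.detach` or `jax.lax.stop_gradient`, and how CoReDi applies it is not specified in this excerpt.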

Code & Models

Repositories

zelaki/CoReDi (GitHub)
