Oracle Noise: Faster Semantic Spherical Alignment for Interpretable Latent Optimization
Haosen Li, Wenshuo Chen, Lei Wang, Shaofeng Liang, Haozhe Jia, Yutao Yue

TL;DR
Oracle Noise introduces a Riemannian hypersphere-based zero-shot framework that accelerates semantic alignment in text-to-image diffusion models, improving quality and efficiency without external parsers.
Contribution
It redefines noise initialization as a spherical, semantic-driven optimization, preserving Gaussian priors and enabling faster, artifact-free alignment in generative models.
Findings
Achieves state-of-the-art performance on human preference metrics.
Significantly accelerates semantic alignment, completing it within 2 seconds.
Eliminates Euclidean norm inflation, reducing visual artifacts.
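As a worked illustration of the norm-inflation finding, the identity below shows why a plain Euclidean ascent step tends to grow the latent norm, and how a tangent-space step followed by rescaling avoids it. The symbols (latent z, reward gradient g, step size η, tangential component g_⊥) are notation assumed here for illustration; the listing does not give the paper's exact update rule, so this is a generic projection-and-retraction sketch rather than the authors' formulation.

```latex
% Euclidean ascent step: z' = z + \eta g
\|z + \eta g\|^2 = \|z\|^2 + 2\eta \langle z, g \rangle + \eta^2 \|g\|^2
% Whenever \langle z, g \rangle \ge 0 and g \ne 0, the norm strictly increases.

% Spherical alternative: keep only the component of g tangent to the sphere at z,
g_\perp = g - \frac{\langle g, z \rangle}{\|z\|^2}\, z ,
% then rescale back to the original radius:
z' = \|z\| \, \frac{z + \eta g_\perp}{\|z + \eta g_\perp\|}
\quad\Longrightarrow\quad \|z'\| = \|z\| .
```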
Abstract
Text-to-image diffusion models have achieved remarkable generative capabilities, yet accurately aligning complex textual prompts with synthesized layouts remains an ongoing challenge. In these models, the initial Gaussian noise acts as a critical structural seed dictating the macroscopic layout. Recent online optimization and search methods attempt to refine this noise to enhance text-image alignment. However, relying on unconstrained Euclidean gradient ascent mathematically inflates the latent norm and destroys the standard Gaussian prior, causing severe visual artifacts like color over-saturation. Furthermore, these methods suffer from inefficient semantic routing and easily fall into the "reward hacking" trap of external proxy models. To address these intertwined bottlenecks, we propose Oracle Noise, a zero-shot framework reframing noise initialization as semantic-driven…
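The contrast the abstract draws between unconstrained Euclidean ascent and norm-preserving optimization can be seen in a small numerical sketch. Everything below is an assumption made for illustration: the toy linear reward, the function names (euclidean_step, spherical_step, toy_reward_grad), the dimension, and the step size are hypothetical, and the spherical update is a generic tangent-projection-plus-rescaling step, not the Oracle Noise algorithm itself.

```python
import numpy as np

def euclidean_step(z, grad, lr):
    # Plain gradient ascent: nothing constrains the norm of z.
    return z + lr * grad

def spherical_step(z, grad, lr):
    # Generic Riemannian-style step on the sphere of radius ||z||:
    # project the gradient onto the tangent space, move, then rescale.
    radius = np.linalg.norm(z)
    unit = z / radius
    tangent = grad - np.dot(grad, unit) * unit
    moved = z + lr * tangent
    return moved * (radius / np.linalg.norm(moved))

def toy_reward_grad(z, target):
    # Hypothetical reward <z, target>; its gradient w.r.t. z is just target.
    return target

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
dim = 4096                          # assumed latent size, for illustration only
z0 = rng.standard_normal(dim)       # initial Gaussian noise
target = rng.standard_normal(dim)   # stand-in for a semantic direction

z_euc, z_sph = z0.copy(), z0.copy()
for _ in range(50):
    z_euc = euclidean_step(z_euc, toy_reward_grad(z_euc, target), lr=0.1)
    z_sph = spherical_step(z_sph, toy_reward_grad(z_sph, target), lr=0.1)

norm_init = np.linalg.norm(z0)
norm_euc = np.linalg.norm(z_euc)
norm_sph = np.linalg.norm(z_sph)

print(f"initial norm:   {norm_init:.2f}")
print(f"euclidean norm: {norm_euc:.2f}")
print(f"spherical norm: {norm_sph:.2f}")
print(f"cosine to target: {cosine(z0, target):.3f} -> {cosine(z_sph, target):.3f}")
```

In this toy setting the Euclidean iterate drifts far from the radius a standard Gaussian sample would have, while the spherical iterate keeps its initial norm and still rotates toward the target direction.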
