The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Weichen Fan; Haiwen Diao; Quan Wang; Dahua Lin; Ziwei Liu

arXiv:2512.19693 · cs.CV · April 2, 2026

TL;DR

This paper introduces the Prism Hypothesis and Unified Autoencoding (UAE), a model that harmonizes semantic and pixel representations by leveraging spectral characteristics, leading to improved image modeling performance.

Contribution

It uncovers a spectral correspondence between semantic and pixel encoders and proposes UAE to unify these representations through frequency-band modulation.

Findings

01. UAE achieves state-of-the-art performance in unifying semantic and pixel representations.

02. UAE improves FID and IS scores over baseline models.

03. Spectral analysis reveals semantic encoders focus on low-frequency components, while pixel encoders retain high-frequency details.
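
The third finding rests on measuring how much of a feature map's energy sits at low spatial frequencies. The following is a minimal sketch of one such measurement, not the authors' analysis code: the function name `low_frequency_energy_fraction`, the radial cutoff of 0.25 cycles per sample, and the two synthetic test patterns are all assumptions chosen for illustration.

```python
import numpy as np


def low_frequency_energy_fraction(feature_map, cutoff=0.25):
    """Return the fraction of spectral energy at or below a radial cutoff.

    feature_map: 2D array (H, W), e.g. one channel of an encoder feature grid.
    cutoff: radial frequency in cycles per sample (hypothetical default;
            the per-axis Nyquist limit is 0.5).
    """
    h, w = feature_map.shape
    power = np.abs(np.fft.fft2(feature_map)) ** 2

    # Radial frequency of every FFT bin, in cycles per sample.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)

    total = power.sum()
    if total == 0:
        return 0.0
    return float(power[radius <= cutoff].sum() / total)


# Synthetic stand-ins for "smooth" and "detailed" feature maps (assumed data).
yy, xx = np.mgrid[0:64, 0:64]
smooth = np.cos(2 * np.pi * 2 * xx / 64)        # 2 cycles across the grid
checker = np.where((yy + xx) % 2 == 0, 1.0, -1.0)  # alternates every sample

smooth_fraction = low_frequency_energy_fraction(smooth)
checker_fraction = low_frequency_energy_fraction(checker)
print(smooth_fraction, checker_fraction)
```

Under these assumptions, the slowly varying pattern keeps nearly all of its energy below the cutoff while the checkerboard keeps nearly none. Applying the same measurement to real encoder features would require choosing a cutoff and a channel aggregation scheme, neither of which is specified in the text above.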

Abstract

Deep representations across modalities are inherently intertwined. In this paper, we systematically analyze the spectral characteristics of various semantic and pixel encoders. Interestingly, our study uncovers a highly inspiring and rarely explored correspondence between an encoder's feature spectrum and its functional role: semantic encoders primarily capture low-frequency components that encode abstract meaning, whereas pixel encoders additionally retain high-frequency information that conveys fine-grained detail. This heuristic finding offers a unifying perspective that ties encoder behavior to its underlying spectral structure. We define it as the Prism Hypothesis, where each data modality can be viewed as a projection of the natural world onto a shared feature spectrum, just like the prism. Building on this insight, we propose Unified Autoencoding (UAE), a model that harmonizes…
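
The abstract frames encoders in terms of frequency bands on a shared spectrum. As a rough illustration of what splitting a signal into frequency bands can look like, the sketch below partitions a 2D array into radial bands with an FFT mask. It is a generic decomposition, not the UAE architecture: the function name `split_into_bands`, the band edges, and the random input are assumptions.

```python
import numpy as np


def split_into_bands(image, edges=(0.0, 0.1, 0.25, 0.75)):
    """Split a 2D array into radial frequency bands that sum back to the input.

    edges: band boundaries in cycles per sample (hypothetical values). The last
           edge exceeds the largest possible radius, sqrt(0.5), so every FFT
           bin falls into exactly one band.
    """
    h, w = image.shape
    spectrum = np.fft.fft2(image)

    # Radial frequency of every FFT bin.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)

    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (radius >= lo) & (radius < hi)
        # The mask is symmetric in frequency, so the inverse transform is real
        # up to rounding error; .real discards that residue.
        bands.append(np.fft.ifft2(spectrum * mask).real)
    return bands


rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))  # assumed stand-in for an image or feature map
bands = split_into_bands(image)
reconstruction = sum(bands)
print(len(bands), np.allclose(reconstruction, image))
```

Because the masks are disjoint and cover the whole spectrum, the bands add up to the original array. The lowest band carries the mean and coarse structure, and the higher bands carry progressively finer detail, which loosely mirrors the low-frequency versus high-frequency distinction drawn in the abstract.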

Code & Models

Repositories

WeichenFan/UAE
github

Models

weepiess2383/UAE
Hugging Face model
