Human-Aligned Generative Perception: Bridging Psychophysics and Generative Models

Antara Titikhsha; Om Kulkarni; Dharun Muthaiah

arXiv:2512.22272 · cs.CV · December 30, 2025

TL;DR

This paper introduces a lightweight human perception model to guide generative image models, enhancing geometric accuracy and semantic alignment without additional training.

Contribution

It proposes a Human Perception Embedding teacher that guides diffusion models to better incorporate geometric constraints and human-like shape understanding.
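The idea of a teacher guiding a sampler can be illustrated with a toy example. The sketch below is not the paper's implementation: `teacher_energy`, `toy_denoiser`, the target vector, and the guidance scale are all hypothetical stand-ins. It assumes the teacher exposes a differentiable score, and it uses a simple quadratic so the gradient can be written by hand.

```python
# Toy sketch of gradient-based guidance. Every name here is hypothetical;
# the quadratic "teacher" stands in for a learned perception model.

def teacher_energy(latent, target):
    """Squared distance between the latent and a target shape code (lower is better)."""
    return sum((z - t) ** 2 for z, t in zip(latent, target))

def teacher_grad(latent, target):
    """Analytic gradient of teacher_energy with respect to the latent."""
    return [2.0 * (z - t) for z, t in zip(latent, target)]

def toy_denoiser(latent):
    """Stand-in for a frozen generative model's update: shrink the latent toward zero."""
    return [0.9 * z for z in latent]

def guided_step(latent, denoise, target, guidance_scale):
    """Apply the frozen denoiser, then nudge the result down the teacher's gradient."""
    proposal = denoise(latent)
    grad = teacher_grad(proposal, target)
    return [p - guidance_scale * g for p, g in zip(proposal, grad)]

def run(guidance_scale, steps=20):
    """Run the toy sampler and return the final teacher energy."""
    latent = [1.0, -2.0, 0.5]
    target = [0.5, 0.5, 0.5]
    for _ in range(steps):
        latent = guided_step(latent, toy_denoiser, target, guidance_scale)
    return teacher_energy(latent, target)

unguided = run(guidance_scale=0.0)
guided = run(guidance_scale=0.1)
```

The denoiser's weights are never updated; only the sampling trajectory changes, which is what makes this style of control training-free. In this toy setup the guided run ends with a lower teacher energy than the unguided one.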

Findings

01 Improved geometric control in image generation.

02 Achieved 80% better semantic alignment.

03 Enabled zero-shot transfer of complex shapes.

Abstract

Text-to-image diffusion models generate highly detailed textures, yet they often rely on surface appearance and fail to follow strict geometric constraints, particularly when those constraints conflict with the style implied by the text prompt. This reflects a broader semantic gap between human perception and current generative models. We investigate whether geometric understanding can be introduced without specialized training by using lightweight, off-the-shelf discriminators as external guidance signals. We propose a Human Perception Embedding (HPE) teacher trained on the THINGS triplet dataset, which captures human sensitivity to object shape. By injecting gradients from this teacher into the latent diffusion process, we show that geometry and style can be separated in a controllable manner. We evaluate this approach across three architectures: Stable Diffusion v1.5 with a U-Net…
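For intuition about training an embedding on human triplet judgments, here is a minimal sketch. It assumes the triplets are odd-one-out choices and uses a softmax over pairwise dot-product similarities, a common formulation for this kind of data. The abstract does not state the loss the authors use, so the function names and the objective below are illustrative assumptions only.

```python
import math

def dot(u, v):
    """Dot-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))

def odd_one_out_probs(e0, e1, e2):
    """Return [p0, p1, p2], where pk is the model's probability that item k is the odd one out.

    Item k is the odd one when the other two items form the most similar pair,
    so entry k is scored by the similarity of the remaining pair.
    """
    sims = [dot(e1, e2), dot(e0, e2), dot(e0, e1)]
    m = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    return [x / total for x in exps]

def triplet_loss(e0, e1, e2, human_choice):
    """Negative log-likelihood of the human's odd-one-out choice (index 0, 1, or 2)."""
    return -math.log(odd_one_out_probs(e0, e1, e2)[human_choice])

# Hypothetical 2-D embeddings: items 0 and 1 are close, item 2 is different.
e0, e1, e2 = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]
probs = odd_one_out_probs(e0, e1, e2)
```

Minimizing `triplet_loss` over many judgments pulls together the items that people treat as similar, so an embedding trained this way can later act as a differentiable score of perceived similarity.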

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Aesthetic Perception and Analysis