Latent Space Imaging

Matheus Souza; Yidan Zheng; Kaizhang Kang; Yogeshwar Nath Mishra; Qiang Fu; Wolfgang Heidrich

arXiv:2407.07052 · eess.IV · March 25, 2025 · 1 citation

Open Access

TL;DR

Inspired by the human visual system, Latent Space Imaging encodes visual information directly into a generative model's latent space, significantly reducing hardware complexity and bandwidth for imaging and downstream tasks.

Contribution

This work introduces a hardware prototype that encodes images into a generative model's latent space, achieving high compression ratios and demonstrating a novel integration of optics and software for efficient imaging.

Findings

01 Achieved compression ratios from 1:100 to 1:1000 during imaging.

02 Up to 1:16384 compression for downstream applications.

03 Demonstrated hardware feasibility with a single-pixel camera.
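
To make the ratios above concrete, the following minimal sketch counts how many scalar measurements a single-pixel camera would record at each ratio. It is an illustration under stated assumptions, not the authors' pipeline: the 128×128 resolution is hypothetical, the random binary masks stand in for whatever encoding the paper actually uses, and the function names `measurements_for_ratio` and `single_pixel_measure` are invented for this example.

```python
import random


def measurements_for_ratio(num_pixels: int, ratio: int) -> int:
    """Scalar measurements left after a 1:ratio compression (at least one)."""
    return max(1, num_pixels // ratio)


def single_pixel_measure(image, patterns):
    """Simulate a single-pixel camera: one inner product per mask pattern."""
    return [sum(p * x for p, x in zip(pattern, image)) for pattern in patterns]


if __name__ == "__main__":
    rng = random.Random(0)
    side = 128  # hypothetical resolution, not taken from the paper
    num_pixels = side * side
    image = [rng.random() for _ in range(num_pixels)]  # flattened stand-in scene

    for ratio in (100, 1000, 16384):
        m = measurements_for_ratio(num_pixels, ratio)
        # Random binary masks are an assumption made only for this sketch.
        patterns = [[rng.getrandbits(1) for _ in range(num_pixels)] for _ in range(m)]
        y = single_pixel_measure(image, patterns)
        print(f"1:{ratio} -> {len(y)} measurements for {num_pixels} pixels")
```

Under these assumptions, each measurement is a single detector reading for one mask, so the number of masks shown to the scene sets the bandwidth directly.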

Abstract

Digital imaging systems have traditionally relied on brute-force measurement and processing of pixels arranged on regular grids. In contrast, the human visual system performs significant data reduction from the large number of photoreceptors to the optic nerve, effectively encoding visual information into a low-bandwidth latent space representation optimized for brain processing. Inspired by this, we propose a similar approach to advance artificial vision systems. Latent Space Imaging introduces a new paradigm that combines optics and software to encode image information directly into the semantically rich latent space of a generative model. This approach substantially reduces bandwidth and memory demands during image capture and enables a range of downstream tasks focused on the latent space. We validate this principle through an initial hardware prototype based on a single-pixel…

Taxonomy

Topics: Medical Imaging Techniques and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings