Target-depth sensing with metasurface-encoder integrated optoelectronic neural network
Shuo Wang, Deyu Zhu, Chenjie Xiong, Bin Hu, Chunqi Jin, Yu Wang, and Chengjun Zou

TL;DR
This paper introduces a metasurface-encoder integrated optoelectronic neural network that compresses 3D target information into 2D images for efficient, real-time classification and depth estimation, reducing computational load.
Contribution
It presents a novel architecture combining metasurface encoding with lightweight neural networks for efficient 3D sensing and target tracking.
Findings
Achieved high accuracy in target classification and depth estimation.
Reduced network complexity and computational burden.
Validated on MNIST and Vehicle-Image datasets.
Abstract
Accurate, real-time sensing of targets in three-dimensional (3D) environments is essential for modern machine vision, underpinning emerging technologies such as autonomous systems, robotic manipulation, augmented reality, and intelligent surveillance. However, state-of-the-art 3D sensing approaches typically rely on complex postprocessing of multi-view images or LiDAR point clouds, resulting in considerable computational load, power consumption, and latency. To address these challenges, we propose a metasurface-encoder integrated optoelectronic neural network architecture that compresses 3D information into two-dimensional images by encoding depth using a double-helix point spread function generated by a metasurface. The depth-encoded images are captured with a conventional monocular camera and subsequently processed by a lightweight shallow ResNet neural network. We experimentally…
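The depth-encoding idea in the abstract can be illustrated with a toy numerical model. The sketch below is not the authors' implementation: it assumes the double-helix point spread function can be approximated by two Gaussian lobes whose orientation rotates linearly with depth, and it replaces the neural network with a simple image-moment estimate of the lobe angle. The function names (`dh_psf`, `encode`, `estimate_depth`) and all parameter values (lobe separation, lobe width, rotation rate) are hypothetical and chosen only for illustration.

```python
import numpy as np

# Assumed rotation rate: lobe angle (radians) per unit of depth. Illustrative only.
RAD_PER_UNIT_DEPTH = np.pi / 4


def dh_psf(depth, shape, separation=8.0, sigma=1.5):
    """Toy double-helix PSF: two Gaussian lobes rotated by an angle proportional to depth."""
    h, w = shape
    theta = RAD_PER_UNIT_DEPTH * depth
    # Pixel coordinates measured from the array centre (h // 2, w // 2).
    y, x = np.mgrid[0:h, 0:w]
    y = y - h // 2
    x = x - w // 2
    dx = 0.5 * separation * np.cos(theta)
    dy = 0.5 * separation * np.sin(theta)
    lobes = (np.exp(-((x - dx) ** 2 + (y - dy) ** 2) / (2 * sigma ** 2))
             + np.exp(-((x + dx) ** 2 + (y + dy) ** 2) / (2 * sigma ** 2)))
    return lobes / lobes.sum()  # normalise so the PSF conserves total intensity


def encode(target, depth):
    """Form a depth-encoded 2D image by circular convolution of the target with the PSF."""
    psf = dh_psf(depth, target.shape)
    # ifftshift moves the PSF centre to index (0, 0) so the convolution does not shift the image.
    spectrum = np.fft.fft2(target) * np.fft.fft2(np.fft.ifftshift(psf))
    return np.real(np.fft.ifft2(spectrum))


def estimate_depth(image):
    """Recover depth of a single point target from the lobe orientation (second image moments).

    This stands in for the neural network; it is unambiguous only while the
    lobe angle stays within (-pi/2, pi/2].
    """
    h, w = image.shape
    y, x = np.mgrid[0:h, 0:w]
    total = image.sum()
    cx = (image * x).sum() / total
    cy = (image * y).sum() / total
    mu_xx = (image * (x - cx) ** 2).sum() / total
    mu_yy = (image * (y - cy) ** 2).sum() / total
    mu_xy = (image * (x - cx) * (y - cy)).sum() / total
    theta = 0.5 * np.arctan2(2 * mu_xy, mu_xx - mu_yy)
    return theta / RAD_PER_UNIT_DEPTH
```

For a single bright pixel as the target, `estimate_depth(encode(target, d))` should return a value close to `d` as long as the corresponding lobe angle stays inside the unambiguous range. Extended targets blur the two lobes together, which is one reason a learned model, rather than a closed-form moment estimate, is a natural choice for the decoding step.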
