Model-free Optical Processors using In Situ Reinforcement Learning with Proximal Policy Optimization

Yuhang Li; Shiqi Chen; Tingyu Gong; Aydogan Ozcan

arXiv:2507.05583·cs.LG·January 5, 2026

TL;DR

This paper presents a model-free reinforcement learning method using Proximal Policy Optimization for in situ training of diffractive optical processors, improving convergence speed and stability without prior system modeling.

Contribution

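The paper introduces a PPO-based in situ training approach for diffractive optical processors. It targets the slow convergence and unstable performance of existing in situ methods, which make inefficient use of limited measurement data, and it trains directly on the physical system so that hardware imperfections, noise, and misalignments are handled without an explicit system model.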

Findings

01 Demonstrated improved convergence in optical tasks
02 Achieved stable training despite hardware imperfections
03 Validated across multiple optical applications

Abstract

Optical computing holds promise for high-speed, energy-efficient information processing, with diffractive optical networks emerging as a flexible platform for implementing task-specific transformations. A challenge, however, is the effective optimization and alignment of the diffractive layers, which is hindered by the difficulty of accurately modeling physical systems with their inherent hardware imperfections, noise, and misalignments. While existing in situ optimization methods offer the advantage of direct training on the physical system without explicit system modeling, they are often limited by slow convergence and unstable performance due to inefficient use of limited measurement data. Here, we introduce a model-free reinforcement learning approach utilizing Proximal Policy Optimization (PPO) for the in situ training of diffractive optical processors. PPO efficiently reuses in…
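
For readers unfamiliar with PPO, the clipped surrogate objective at its core can be sketched as follows. This is the standard formulation from the general PPO literature, not the authors' implementation. The function name, argument names, and the default clip range of 0.2 are illustrative assumptions, and how the paper maps optical measurements to log-probabilities and advantages is not stated in the text above.

```python
import math


def ppo_clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp   -- log-probabilities of the sampled actions under the current policy
    old_logp   -- log-probabilities of the same actions under the policy that
                  collected the data
    advantages -- advantage estimates for those actions
    clip_eps   -- clip range (0.2 is an assumed, commonly used default)
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Hypothetical batch: the second action has become twice as likely
# under the current policy as it was when the data was collected.
objective = ppo_clipped_surrogate(
    new_logp=[0.0, math.log(2.0)],
    old_logp=[0.0, 0.0],
    advantages=[1.0, 1.0],
)
```

Because the ratio is clipped, a positive-advantage sample stops contributing extra gain once the policy has moved more than `clip_eps` away from the data-collecting policy. In general, this is what lets PPO run several update steps on the same batch of measurements.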

Taxonomy

Methods: Proximal Policy Optimization