Model-free Optical Processors using In Situ Reinforcement Learning with Proximal Policy Optimization
Yuhang Li, Shiqi Chen, Tingyu Gong, Aydogan Ozcan

TL;DR
This paper presents a model-free reinforcement learning method using Proximal Policy Optimization for in situ training of diffractive optical processors, improving convergence speed and stability without prior system modeling.
Contribution
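It introduces a PPO-based in situ training approach for diffractive optical processors that targets the slow convergence and unstable performance of existing in situ methods, and that trains directly on the physical system so hardware imperfections, noise, and misalignments need not be modeled explicitly.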
Findings
Demonstrated improved convergence in optical tasks
Achieved stable training despite hardware imperfections
Validated across multiple optical applications
Abstract
Optical computing holds promise for high-speed, energy-efficient information processing, with diffractive optical networks emerging as a flexible platform for implementing task-specific transformations. A challenge, however, is the effective optimization and alignment of the diffractive layers, which is hindered by the difficulty of accurately modeling physical systems with their inherent hardware imperfections, noise, and misalignments. While existing in situ optimization methods offer the advantage of direct training on the physical system without explicit system modeling, they are often limited by slow convergence and unstable performance due to inefficient use of limited measurement data. Here, we introduce a model-free reinforcement learning approach utilizing Proximal Policy Optimization (PPO) for the in situ training of diffractive optical processors. PPO efficiently reuses in…
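For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective from the general PPO literature. It is not taken from this paper, and the authors' exact loss, policy parameterization, and reward definition may differ. The function name `ppo_clip_objective` and its arguments are hypothetical and used only for illustration. In an in situ optical setting one would presumably derive the advantages from measured outputs of the physical system, which is an assumption here.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate objective (to be maximized).

    logp_new / logp_old: log-probabilities of the sampled actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for those actions (hypothetical inputs;
    how they are computed from measurements is not specified here).
    """
    terms = []
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # probability ratio pi_new / pi_old
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Pessimistic choice: take the smaller of the two surrogate terms.
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return sum(terms) / len(terms)

# Usage: with an unchanged policy the ratio is 1, so the objective
# reduces to the mean advantage.
value = ppo_clip_objective([0.0, 0.0], [0.0, 0.0], [1.0, 3.0])
```

Because the ratio is clipped, a sample with positive advantage stops contributing extra gain once the new policy has moved more than `clip_eps` away from the data-collecting policy. This is what lets PPO in general take several update steps on the same batch of collected data without the policy drifting too far.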
Taxonomy
Methods: Proximal Policy Optimization
