Light-Bound Transformers: Hardware-Anchored Robustness for Silicon-Photonic Computer Vision Systems
Xuming Chen, Deniz Najafi, Chengwei Zhou, Pietro Mercati, Arman Roohi, Mohsen Imani, Mahdi Nikdast, Shaahin Angizi, and Gourav Datta

TL;DR
This paper presents a hardware-aware training framework for silicon-photonic Vision Transformers that accounts for device noise and energy constraints, improving robustness and accuracy in photonic vision systems.
Contribution
It introduces a bank-level noise characterization of microring-resonator (MR) arrays and a Chance-Constrained Training (CCT) method tailored to silicon-photonic accelerators, enabling robust ViT deployment under measured hardware noise.
Findings
Restores near-clean accuracy under realistic noise budgets.
Develops a noise-aware LayerNorm for stable feature statistics.
Creates a practical pipeline from measurement to training and deployment.
Abstract
Deploying Vision Transformers (ViTs) on near-sensor analog accelerators demands training pipelines that are explicitly aligned with device-level noise and energy constraints. We introduce a compact framework for silicon-photonic execution of ViTs that integrates measured hardware noise, robust attention training, and an energy-aware processing flow. We first characterize bank-level noise in microring-resonator (MR) arrays, including fabrication variation, thermal drift, and amplitude noise, and convert these measurements into closed-form, activation-dependent variance proxies for attention logits and feed-forward activations. Using these proxies, we develop Chance-Constrained Training (CCT), which enforces variance-normalized logit margins to bound attention rank flips, and a noise-aware LayerNorm that stabilizes feature statistics without changing the optical schedule. These components…
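To make the idea of a variance-normalized logit margin concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names `flip_probability` and `margin_penalty`, the tolerance `eps`, the hinge form of the penalty, and the assumption of independent zero-mean Gaussian noise on each logit are all illustrative choices that the abstract does not specify. Under those assumptions, the probability that a runner-up logit overtakes the top logit depends only on their margin divided by the combined noise standard deviation, so bounding that ratio from below bounds the rank-flip probability.

```python
import math
from statistics import NormalDist

# Assumption: each attention logit carries independent zero-mean Gaussian
# noise with a known, strictly positive variance (a stand-in for the
# activation-dependent variance proxies described in the abstract).

def flip_probability(z_a, z_b, var_a, var_b):
    """Probability that noisy z_b exceeds noisy z_a (a rank flip)."""
    sigma = math.sqrt(var_a + var_b)  # std of the noisy difference
    return NormalDist().cdf(-(z_a - z_b) / sigma)

def margin_penalty(logits, variances, eps=0.05):
    """Hinge penalty on the variance-normalized top-1 vs top-2 margin.

    The penalty is zero once the flip probability between the two
    largest logits is at most eps (hypothetical tolerance).
    """
    kappa = NormalDist().inv_cdf(1.0 - eps)  # required normalized margin
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    top, second = order[0], order[1]
    sigma = math.sqrt(variances[top] + variances[second])
    normalized_margin = (logits[top] - logits[second]) / sigma
    return max(0.0, kappa - normalized_margin)

if __name__ == "__main__":
    logits = [3.0, 0.0, -1.0]
    variances = [0.5, 0.5, 0.5]
    print(flip_probability(logits[0], logits[1], variances[0], variances[1]))
    print(margin_penalty(logits, variances, eps=0.05))
```

In a training loop, a term like `margin_penalty` would be added to the task loss so that the optimizer widens margins where the hardware noise is largest; how the paper actually weights or aggregates such a term is not stated in the text available here.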
