Streamlined optical training of large-scale modern deep learning architectures with direct feedback alignment
Ziao Wang, Kilian Müller, Matthew Filipovich, Julien Launay, Ruben Ohana, Gustave Pariente, Safa Mokaadi, Charles Brossollet, Fabien Moreau, Alessandro Cappelli, Iacopo Poli, Igor Carron, Laurent Daudet, Florent Krzakala, Sylvain Gigan

TL;DR
This paper demonstrates a scalable hybrid optical-electronic training method for large deep neural networks, including Transformers, using direct feedback alignment on a photonic platform, enabling faster and energy-efficient training.
Contribution
It introduces a novel hybrid opto-electronic training approach with optical processing units performing large-scale matrix multiplications for deep learning.
Findings
The optical processing unit performed large-scale random matrix multiplications at speeds up to 1500 TeraOPS while drawing under 30 Watts of power.
Successfully trained Transformers with over 1 billion parameters.
Potential for scaling to ultra-deep neural networks with energy efficiency.
Abstract
Modern deep learning relies nearly exclusively on dedicated electronic hardware accelerators. Photonic approaches, with low consumption and high operation speed, are increasingly considered for inference but, to date, remain mostly limited to relatively basic tasks. Simultaneously, the problem of training deep and complex neural networks, overwhelmingly performed through backpropagation, remains a significant limitation to the size and, consequently, the performance of current architectures and a major compute and energy bottleneck. Here, we experimentally implement a versatile and scalable training algorithm, called direct feedback alignment, on a hybrid electronic-photonic platform. An optical processing unit performs large-scale random matrix multiplications, which is the central operation of this algorithm, at speeds up to 1500 TeraOPS under 30 Watts of power. We perform optical…
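To make the role of the random matrix multiplication concrete, here is a minimal NumPy sketch of direct feedback alignment on a toy regression task. It is an illustration under stated assumptions, not the authors' implementation: the network size, learning rate, toy data, and the names `train_dfa`, `B1`, and `B2` are all hypothetical, and the random projections `e @ B1` and `e @ B2` are computed electronically here, whereas the abstract describes performing this kind of operation on the optical processing unit.

```python
import numpy as np

def train_dfa(X, Y, hidden=32, lr=0.02, epochs=500, seed=0):
    """Train a small 3-layer tanh MLP with direct feedback alignment (DFA).

    Hypothetical toy setup for illustration only.
    Returns the mean squared error recorded at every epoch.
    """
    rng = np.random.default_rng(seed)
    n, n_in = X.shape
    n_out = Y.shape[1]

    # Trainable forward weights.
    W1 = rng.normal(0, 1 / np.sqrt(n_in), (n_in, hidden))
    W2 = rng.normal(0, 1 / np.sqrt(hidden), (hidden, hidden))
    W3 = rng.normal(0, 1 / np.sqrt(hidden), (hidden, n_out))

    # Fixed random feedback matrices: never trained, one per hidden layer.
    B1 = rng.normal(0, 1 / np.sqrt(n_out), (n_out, hidden))
    B2 = rng.normal(0, 1 / np.sqrt(n_out), (n_out, hidden))

    losses = []
    for _ in range(epochs):
        # Forward pass.
        h1 = np.tanh(X @ W1)
        h2 = np.tanh(h1 @ W2)
        out = h2 @ W3

        # Output error (gradient of 0.5 * squared error w.r.t. the output).
        e = out - Y
        losses.append(float(np.mean(e ** 2)))

        # DFA: project the output error directly to each hidden layer
        # through its fixed random matrix, instead of backpropagating
        # through the transposed forward weights.
        d2 = (e @ B2) * (1 - h2 ** 2)  # tanh'(a) = 1 - tanh(a)^2
        d1 = (e @ B1) * (1 - h1 ** 2)

        # The output layer uses its true gradient; hidden layers use DFA signals.
        W3 -= lr * h2.T @ e / n
        W2 -= lr * h1.T @ d2 / n
        W1 -= lr * X.T @ d1 / n
    return losses

# Toy data (an assumption for the demo): a smooth scalar function of 4 inputs.
rng = np.random.default_rng(1)
X = rng.normal(size=(64, 4))
Y = np.sin(X @ rng.normal(size=(4, 1)))

losses = train_dfa(X, Y)
print(f"initial MSE: {losses[0]:.4f}, final MSE: {losses[-1]:.4f}")
```

Because each hidden layer's update depends only on the output error and its own fixed random matrix, the projections for all layers are independent of one another and can be computed in parallel, which is what makes a dedicated random-projection device a natural fit for this algorithm.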
Taxonomy
Topics: Neural Networks and Reservoir Computing
