Streamlined optical training of large-scale modern deep learning architectures with direct feedback alignment

Ziao Wang; Kilian Müller; Matthew Filipovich; Julien Launay; Ruben Ohana; Gustave Pariente; Safa Mokaadi; Charles Brossollet; Fabien Moreau; Alessandro Cappelli; Iacopo Poli; Igor Carron; Laurent Daudet; Florent Krzakala; Sylvain Gigan

arXiv:2409.12965·cs.ET·April 3, 2025·2 cites

Open Access

TL;DR

This paper demonstrates a scalable hybrid optical-electronic training method for large deep neural networks, including Transformers, using direct feedback alignment on a photonic platform, enabling faster and energy-efficient training.

Contribution

It introduces a novel hybrid opto-electronic training approach with optical processing units performing large-scale matrix multiplications for deep learning.

Findings

01 The optical processing unit performs large-scale random matrix multiplications at speeds up to 1500 TeraOPS under 30 Watts of power.

02 Transformers with over 1 billion parameters were trained successfully.

03 The approach shows potential for energy-efficient scaling to ultra-deep neural networks.

Abstract

Modern deep learning relies nearly exclusively on dedicated electronic hardware accelerators. Photonic approaches, with low consumption and high operation speed, are increasingly considered for inference but, to date, remain mostly limited to relatively basic tasks. Simultaneously, the problem of training deep and complex neural networks, overwhelmingly performed through backpropagation, remains a significant limitation to the size and, consequently, the performance of current architectures and a major compute and energy bottleneck. Here, we experimentally implement a versatile and scalable training algorithm, called direct feedback alignment, on a hybrid electronic-photonic platform. An optical processing unit performs large-scale random matrix multiplications, which is the central operation of this algorithm, at speeds up to 1500 TeraOPS under 30 Watts of power. We perform optical…
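To make the role of the random matrix multiplication concrete, here is a minimal sketch of one direct feedback alignment update for a toy two-layer network. It is an illustration under stated assumptions, not the authors' implementation: the network sizes, the tanh activation, the squared-error loss, the learning rate, and all names (`dfa_step`, `random_matrix`, `B1`) are hypothetical, and the fixed random projection that the paper assigns to the optical processing unit is simulated here in plain Python.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def random_matrix(rows, cols, rng, scale):
    """Gaussian random matrix (hypothetical stand-in for the optical projection)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def dfa_step(W1, W2, B1, x, y, lr):
    """One DFA update for the network x -> tanh(W1 x) -> W2 h with squared error.

    W1 and W2 are updated in place; B1 is fixed and never trained.
    Returns the squared error measured before the update.
    """
    h = [math.tanh(a) for a in matvec(W1, x)]
    out = matvec(W2, h)
    e = [o - t for o, t in zip(out, y)]  # output error

    # DFA: send the error to the hidden layer through the fixed random
    # matrix B1, where backpropagation would use the transpose of W2.
    feedback = matvec(B1, e)
    delta1 = [f * (1.0 - hi * hi) for f, hi in zip(feedback, h)]  # tanh' = 1 - tanh^2

    # The output layer uses its exact gradient; the hidden layer uses the DFA signal.
    for i, ei in enumerate(e):
        for j, hj in enumerate(h):
            W2[i][j] -= lr * ei * hj
    for i, di in enumerate(delta1):
        for j, xj in enumerate(x):
            W1[i][j] -= lr * di * xj
    return sum(v * v for v in e)

# Toy usage on a single training example (all values are assumptions).
rng = random.Random(0)
n_in, n_hid, n_out = 4, 8, 2
W1 = random_matrix(n_hid, n_in, rng, 0.5)
W2 = random_matrix(n_out, n_hid, rng, 0.5)
B1 = random_matrix(n_hid, n_out, rng, 0.5)
x = [0.5, -1.0, 0.25, 1.0]
y = [1.0, -1.0]
losses = [dfa_step(W1, W2, B1, x, y, lr=0.05) for _ in range(200)]
```

The only place the feedback matrix appears is the single product `matvec(B1, e)`. Because `B1` is random and fixed, that product does not depend on the forward weights, which is what makes it a candidate for offloading to dedicated hardware.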

Taxonomy

Topics: Neural Networks and Reservoir Computing