Sustainable Transformer Neural Network Acceleration with Stochastic Photonic Computing
S. Afifi, O. Alo, I. Thakkar, S. Pasricha

TL;DR
ASTRA is a novel silicon-photonic accelerator that uses stochastic computing to significantly improve the speed and energy efficiency of transformer models.
Contribution
ASTRA is the first optical stochastic computing-based accelerator for transformers, combining novel optical stochastic multipliers with unary/analog homodyne accumulation.
Findings
Achieves at least a 7.6x speedup over state-of-the-art accelerators.
Lowers energy overheads by 1.3x compared to state-of-the-art accelerators.
Demonstrates potential for scalable and sustainable transformer inference.
Abstract
Transformers achieve state-of-the-art performance in natural language processing, vision, and scientific computing, but demand high computation and memory. To address these challenges, we present ASTRA, the first silicon-photonic accelerator leveraging stochastic computing for transformers. ASTRA employs novel optical stochastic multipliers and unary/analog homodyne accumulation in a crosstalk-minimal organization to efficiently process dynamic tensor computations. Evaluations show at least 7.6x speedup and 1.3x lower energy overheads compared to state-of-the-art accelerators, highlighting ASTRA's potential for efficient, scalable, and sustainable transformer inference.
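The abstract does not spell out how stochastic multiplication works, so the following is a minimal software sketch of the general stochastic computing principle, not ASTRA's optical circuit. It assumes unipolar encoding, where a value in [0, 1] is represented by the fraction of ones in a random bitstream; under that assumption, a bitwise AND of two independent streams estimates their product, and counting ones across products stands in for accumulation. All function names (`to_bitstream`, `sc_multiply`, `decode`, `sc_dot`) and the stream length are hypothetical choices made for illustration.

```python
import random


def to_bitstream(p, length, rng):
    """Encode a value p in [0, 1] as a unipolar stochastic bitstream.

    Each bit is 1 with probability p, so the fraction of ones approximates p.
    """
    return [1 if rng.random() < p else 0 for _ in range(length)]


def sc_multiply(a_bits, b_bits):
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [x & y for x, y in zip(a_bits, b_bits)]


def decode(bits):
    """Recover the encoded value as the fraction of ones in the stream."""
    return sum(bits) / len(bits)


def sc_dot(xs, ws, length, rng):
    """Estimate sum(x * w) by AND-multiplying each pair and counting ones.

    Counting ones across all product streams is a software stand-in for
    accumulation; it is an assumption of this sketch, not the paper's design.
    """
    total_ones = 0
    for x, w in zip(xs, ws):
        product = sc_multiply(to_bitstream(x, length, rng),
                              to_bitstream(w, length, rng))
        total_ones += sum(product)
    return total_ones / length


rng = random.Random(0)  # fixed seed so the sketch is reproducible
length = 4096           # longer streams reduce the estimation error

product_estimate = decode(
    sc_multiply(to_bitstream(0.5, length, rng), to_bitstream(0.25, length, rng))
)
dot_estimate = sc_dot([0.5, 0.8], [0.25, 0.5], length, rng)

print(f"0.5 * 0.25 is 0.125, estimated as {product_estimate:.4f}")
print(f"dot product is 0.525, estimated as {dot_estimate:.4f}")
```

The estimates are approximate by construction: the error shrinks roughly with the square root of the stream length, which is the usual accuracy-versus-latency trade-off in stochastic computing.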
