Afifi, Salma, author
Alo, Oluwaseun, author
Thakkar, Ishan, author
Pasricha, Sudeep, author
ACM, publisher
2025-12-22
2025-12-22
2025-09-07
Salma Afifi, Oluwaseun Alo, Ishan Thakkar, and Sudeep Pasricha. 2025. ASTRA: A Stochastic Transformer Neural Network Accelerator with Silicon Photonics. ACM Trans. Embed. Comput. Syst. Just Accepted (September 2025). https://doi.org/10.1145/3769092
https://hdl.handle.net/10217/242561
Transformers have emerged as a dominant architecture in deep learning, demonstrating unparalleled success across a wide range of applications, including natural language processing (NLP), computer vision (CV), and scientific computing. By leveraging the self-attention mechanism, transformers achieve superior performance over traditional models such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs). However, these performance gains come at a cost: high computational complexity and substantial memory requirements, making transformers particularly challenging to deploy efficiently on conventional hardware. To address the increasingly intensive computational demands of attention-based transformers, there is growing interest in developing efficient and high-speed hardware accelerators. Silicon photonics has emerged as a promising alternative to digital electronics, offering high-bandwidth and low-latency computation while improving overall computational and energy efficiency. This work introduces ASTRA, the first optical hardware accelerator that leverages stochastic computing principles for transformer neural networks. ASTRA incorporates novel full-range optical stochastic multipliers and stochastic-analog compute-capable optical-to-electrical transducer units to efficiently handle both static and dynamic tensor computations in attention-based models. Through detailed performance analysis, we demonstrate that ASTRA achieves at least 7.6× speedup and 1.3× lower energy consumption compared to state-of-the-art transformer accelerators.
born digital
articles
eng
©Salma Afifi, et al. ACM 2025. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Trans. Embed. Comput. Syst., 2025. https://doi.org/10.1145/3769092.
transformer neural networks
silicon photonics
inference acceleration
stochastic computing
optical computing
ASTRA: a stochastic transformer neural network accelerator with silicon photonics
Text
https://doi.org/10.1145/3769092