Hardware-software codesign of silicon photonic AI accelerators

Sunny, Febin P., author; Pasricha, Sudeep, advisor; Nikdast, Mahdi, advisor; Chen, Haonan, committee member; Malaiya, Yashwant K., committee member

Hardware-software codesign of silicon photonic AI accelerators

dc.contributor.author	Sunny, Febin P., author
dc.contributor.author	Pasricha, Sudeep, advisor
dc.contributor.author	Nikdast, Mahdi, advisor
dc.contributor.author	Chen, Haonan, committee member
dc.contributor.author	Malaiya, Yashwant K., committee member
dc.date.accessioned	2024-09-09T20:52:02Z
dc.date.available	2024-09-09T20:52:02Z
dc.date.issued	2024
dc.description.abstract	Machine learning applications have become increasingly prevalent over the past decade across many real-world use cases, from smart consumer electronics to automotive, healthcare, cybersecurity, and language processing. This prevalence has been fueled by the emergence of powerful machine learning models, such as Deep Neural Networks (DNNs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). As researchers explore deeper models with higher connectivity, the computing power and the memory requirement necessary to train and utilize them also increase. Such increasing complexity also necessitates that the underlying hardware platform should consistently deliver better performance while satisfying strict power constraints. Unfortunately, the limited performance-per-watt in today's computing platforms – such as general-purpose CPUs, GPUs, and electronic neural network (NN) accelerators – creates significant challenges for the growth of new deep learning and AI applications. These electronic computing platforms face fundamental limits in the post-Moore Law era due to increased ohmic losses and capacitance-induced latencies in interconnects, as well as power inefficiencies and reliability concerns that reduce yields and increase costs with semiconductor-technology scaling. A solution to improving performance-per-watt for AI model processing is to explore more efficient hardware NN accelerator platforms. Silicon photonics has shown promise in terms of achievable energy efficiency and latency for data transfers. It is also possible to use photonic components to perform computation, e.g., matrix-vector multiplication. Such photonics-based AI accelerators can not only address the fan-in and fan-out problem with linear algebra processors, but their operational bandwidth can approach the photodetection rate (typically in the hundreds of GHz), which is orders of magnitude higher than electronic systems today that operate at a clock rate of a few GHz. A solution to the data-movement bottleneck can be the use of silicon photonics technology for photonic networks-on-chip (PNoCs), which can enable ultra-high bandwidth, low latency, and energy-efficient communication. However, to ensure reliable, efficient, and high throughput communication and computation using photonics, several challenges must be addressed first. Photonic computation is performed in the analog domain, which makes it susceptible to various noise sources and drives down the achievable resolution for representing NN model parameters. To increase the reliability of silicon photonic AI accelerators, fabrication-process variation (FPV), which is the change in physical dimensions and characteristics of devices due to imperfections in fabrication, must be addressed. FPVs induce resonant wavelength shifts that need to be compensated, for the microring resonators (MRs), which are the fundamental devices to realize photonic computation and communication in our proposed accelerator architectures, to operate correctly. Without this correction, FPVs will cause increased crosstalk and data corruption during photonic communication and can also lead to errors during photonic computation. Accordingly, the correction for FPVs is an essential part of reliable computation in silicon photonic-based AI accelerators. Even with FPV-resilient silicon photonic devices, the tuning latency incurred by thermo-optic (TO) tuning and the thermal crosstalk it can induce are significant. The latency, which can be in the microsecond range, impacts the overall throughput of the accelerator and the thermal crosstalk impacts its reliable operation. At the architectural level it is also necessary to ensure that the NN processing is done efficiently while making use of the photonic resources in terms of wavelengths, and NN model-aware decisions in terms of device deployment, arrangement, and multiply and accumulate (MAC) unit design have to be performed. To address these challenges, the major contributions of this thesis are focused on proposing a hardware-software co-design framework to enable high throughput, low latency, and energy-efficient AI acceleration across various neural network models, using silicon photonics. At the architectural level, we have proposed wavelength reuse schemes, vector decomposition, and NN-aware MAC unit designs for increased efficiency in laser power consumption. In terms of NN-aware designs, we have proposed layer-specific acceleration units, photonic batch normalization folding, and fine-grained sparse NN acceleration units. To tackle the reliability challenges introduced by FPV, we have performed device-level design-space exploration and optimization to design MRs that are more tolerant to FPVs than the state-of-the-art efforts in this area. We also adapt Thermal Eigen-mode decomposition and have devised various novel techniques to manage thermal and spectral crosstalk sources, allowing our silicon photonic-based AI accelerators to reach up to 16-bit parameter resolution per MR, which enables high accuracy for most NN models.
dc.format.medium	born digital
dc.format.medium	doctoral dissertations
dc.identifier	Sunny_colostate_0053A_18395.pdf
dc.identifier.uri	https://hdl.handle.net/10217/239207
dc.identifier.uri	https://doi.org/10.25675/3.03012
dc.language	English
dc.language.iso	eng
dc.publisher	Colorado State University. Libraries
dc.relation.ispartof	2020-
dc.rights	Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.
dc.subject	computer architecture
dc.subject	inference acceleration
dc.subject	silicon photonics
dc.subject	hardware-software codesign
dc.subject	artificial intelligence
dc.subject	machine learning
dc.title	Hardware-software codesign of silicon photonic AI accelerators
dc.type	Text
dcterms.rights.dpla	This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
thesis.degree.discipline	Electrical and Computer Engineering
thesis.degree.grantor	Colorado State University
thesis.degree.level	Doctoral
thesis.degree.name	Doctor of Philosophy (Ph.D.)

Files

Original bundle

Now showing 1 - 1 of 1

Name:: Sunny_colostate_0053A_18395.pdf
Size:: 14.68 MB
Format:: Adobe Portable Document Format

Download

Collections

2020-
Theses and Dissertations