Browsing by Author "Afifi, Salma, author"
Now showing 1 - 3 of 3
Item (Open Access)
GHOST: a graph neural network accelerator using silicon photonics (Colorado State University. Libraries, 2023-09-09)
Afifi, Salma, author; Sunny, Febin, author; Shafiee, Amin, author; Nikdast, Mahdi, author; Pasricha, Sudeep, author; ACM, publisher
Graph neural networks (GNNs) have emerged as a powerful approach for modelling and learning from graph-structured data. Multiple fields, such as recommendation systems, social network analysis, drug discovery, and robotics, have since benefited enormously from the capabilities of GNNs. However, accelerating and efficiently processing GNNs require a unique approach that goes beyond conventional artificial neural network accelerators, due to the substantial computational and memory requirements of GNNs. The slowdown of scaling in CMOS platforms also motivates a search for alternative implementation substrates. In this paper, we present GHOST, the first silicon-photonic hardware accelerator for GNNs. GHOST efficiently alleviates the costs associated with both vertex-centric and edge-centric operations. It separately implements, in the optical domain, the three main stages involved in running GNNs, allowing it to be used for inference with various widely used GNN models and architectures, such as graph convolution networks and graph attention networks. Our simulation studies indicate that GHOST exhibits at least 10.2× better throughput and 3.8× better energy efficiency when compared to GPUs, TPUs, CPUs, and multiple state-of-the-art GNN hardware accelerators.

Item (Open Access)
Silicon photonic hardware accelerators for transformers and graph neural networks (Colorado State University. Libraries, 2023)
Afifi, Salma, author; Pasricha, Sudeep, advisor; Nikdast, Mahdi, committee member; Malaiya, Yashwant, committee member
The rapid growth of artificial intelligence (AI) applications has revolutionized the way we process data, make decisions, and interact with machines. Specifically, artificial neural networks (ANNs) have significantly evolved and now encompass various advanced neural networks such as transformers and graph neural networks (GNNs). This has enabled the development of innovative AI applications that can transform several industries, including healthcare, recommendation systems, and robotics. Transformers and transformer-based neural networks have outperformed multiple ANNs, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), across many natural language processing (NLP) tasks. Moreover, transformers are currently being integrated into vision tasks through the vision transformer (ViT) model. Similarly, GNNs have witnessed a surge of advancements over the past few years and have established their proficiency in dealing with graph-structured data. Nevertheless, each of these neural networks poses unique challenges that hinder its inference and usage in resource-constrained systems. For instance, the transformer model's size, number of parameters, and complexity of operations lead to long inference times, a large memory footprint, and a low computation-to-memory ratio. GNN inference challenges, on the other hand, stem from their mix of dense and very sparse computations. Additionally, the wide variety of possible input graph structures and algorithms dictates the need for a system that can efficiently adapt its execution and operations to the specific graph structure and effectively scale to extremely large graphs.
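To make the dense-versus-sparse distinction concrete, the sketch below shows a single graph-convolution layer in plain NumPy: an edge-centric aggregation over a sparse, irregular edge list followed by a vertex-centric dense matrix multiply. This is an illustration only; the function, data layout, and mean aggregation are assumptions made for the example, not code from GHOST or the thesis.

    # Illustrative sketch: one graph-convolution layer in plain NumPy, showing the
    # sparse (edge-centric) aggregation and the dense (vertex-centric) transform
    # that GNN accelerators must both handle efficiently.
    import numpy as np

    def gcn_layer(edges, node_feats, weight):
        """edges: (E, 2) array of directed (src, dst) pairs (assumed format).
        node_feats: (N, F_in) dense node features.
        weight: (F_in, F_out) dense layer weights."""
        num_nodes = node_feats.shape[0]

        # Edge-centric step: mean-aggregate each node's neighbor features.
        agg = np.zeros_like(node_feats)
        deg = np.zeros(num_nodes)
        for src, dst in edges:                      # very sparse, irregular access
            agg[dst] += node_feats[src]
            deg[dst] += 1
        agg /= np.maximum(deg, 1)[:, None]

        # Vertex-centric step: dense matrix multiply shared by all nodes.
        return np.maximum(agg @ weight, 0.0)        # ReLU activation

    # Tiny example: 3 nodes, 4 directed edges, 4-dim features -> 2-dim output.
    edges = np.array([[0, 1], [1, 0], [2, 1], [1, 2]])
    x = np.random.rand(3, 4)
    w = np.random.rand(4, 2)
    print(gcn_layer(edges, x, w).shape)             # (3, 2)

The irregular, graph-dependent memory access in the aggregation loop is what distinguishes GNN inference from the purely dense workloads that conventional ANN accelerators target.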
Accordingly, conventional computing processors and ANN accelerators are not tailored to address such challenges, and using them to accelerate transformer and GNN execution can be highly inefficient. Furthermore, the utilization of traditional electronic accelerators entails a number of limitations, including escalating fabrication costs due to low yields and the diminishing performance improvements associated with semiconductor-technology scaling. This has led researchers to investigate other technologies for ANN acceleration, such as silicon photonics, which enables performing complex operations in the optical domain with low energy consumption and at very high throughput. While several hardware accelerators leveraging silicon photonics have been presented for networks such as CNNs, none have been customized for emerging complex neural networks such as transformers and GNNs. Due to the various challenges associated with each of these networks, designing reliable and efficient inference hardware accelerators for transformers and GNNs is a non-trivial problem. This thesis introduces two novel silicon-photonic-based hardware architectures for energy-efficient and high-throughput inference acceleration. As our first contribution, we propose a non-coherent silicon photonic hardware accelerator for transformer neural networks, called TRON. We demonstrate how TRON is able to accommodate a wide range of transformer and transformer-based neural networks while surpassing GPUs, CPUs, TPUs, and several state-of-the-art transformer hardware accelerators. For GNN inference acceleration, we propose GHOST, a hardware accelerator that integrates various device-, circuit-, and architecture-level optimizations, enabling it to efficiently process a broad family of GNNs and real-world graph structures and sizes. When compared to multiple state-of-the-art GNN hardware accelerators, GPUs, CPUs, and TPUs, our experiments showcase how GHOST exhibits significantly better performance and energy efficiency.

Item (Open Access)
TRON: transformer neural network acceleration with non-coherent silicon photonics (Colorado State University. Libraries, 2023-06-05)
Afifi, Salma, author; Sunny, Febin, author; Nikdast, Mahdi, author; Pasricha, Sudeep, author; ACM, publisher
Transformer neural networks are rapidly being integrated into state-of-the-art solutions for natural language processing (NLP) and computer vision. However, the complex structure of these models creates challenges for accelerating their execution on conventional electronic platforms. We propose TRON, the first silicon photonic hardware accelerator for transformer-based neural networks such as BERT and Vision Transformers. Our analysis demonstrates that TRON exhibits at least 14× better throughput and 8× better energy efficiency in comparison to state-of-the-art transformer accelerators.
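For context on the workload described in this and the preceding items, the sketch below shows single-head scaled dot-product attention in plain NumPy, the dense matrix-multiplication pattern that dominates inference in transformer models such as BERT and ViT. It illustrates the computation only; the function, shapes, and single-head setup are assumptions for the example and say nothing about how TRON maps this operation onto non-coherent silicon photonics.

    # Illustrative sketch: single-head scaled dot-product attention in NumPy,
    # the dense matrix-multiplication pattern at the core of transformer inference.
    import numpy as np

    def scaled_dot_product_attention(q, k, v):
        """q, k, v: (seq_len, d) arrays for one attention head (assumed shapes)."""
        d = q.shape[-1]
        scores = q @ k.T / np.sqrt(d)                  # (seq_len, seq_len) matmul
        scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ v                             # second large matmul

    # Tiny example: sequence of 8 tokens, 16-dim head.
    q = np.random.rand(8, 16)
    k = np.random.rand(8, 16)
    v = np.random.rand(8, 16)
    print(scaled_dot_product_attention(q, k, v).shape)  # (8, 16)

Each attention layer repeats these matrix products across many heads and tokens, which is why throughput and energy per multiply-accumulate are the figures of merit these accelerators report.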