Browsing by Author "Nikdast, Mahdi, advisor"
Item Open Access Design and optimization of efficient, fault-tolerant and secure 2.5D chiplet systems (Colorado State University. Libraries, 2024) Taheri, Ebad, author; Nikdast, Mahdi, advisor; Pasricha, Sudeep, advisor; Malaiya, Yashwant K., committee member; Jayasumana, Anura P., committee member
In response to the burgeoning demand for high-performance computing systems, this Ph.D. dissertation investigates the pivotal challenges surrounding Networks-on-Chip (NoCs) within the framework of 2.5D and 3D integration technologies, with a primary objective of enhancing the efficiency, fault tolerance, and security of forthcoming computing system architectures. The inherent limitations in bandwidth and reliability at the boundary of chiplets in 2.5D chiplet systems engender significant challenges in traffic management, latency, and energy efficiency. Furthermore, the interconnected global network on an interposer, linking multiple chiplets, necessitates high-bandwidth, low-latency communication to accommodate the substantial traffic generated by numerous cores across diverse chiplets. This Ph.D. dissertation emphasizes various design aspects of NoCs, such as latency, energy efficiency, fault tolerance, and security. It explores the design of 3D NoCs leveraging Through-Silicon Vias (TSVs) for vertical communication. To address reliability concerns and fabrication costs associated with high TSV density, Partially Connected 3D NoC (PC-3DNoC) has been proposed. An adaptive congestion-aware TSV link selection algorithm is introduced to manage traffic load and optimize communication, resulting in reduced latency and improved energy efficiency. For 2.5D chiplet systems, a novel deadlock-free and fault-tolerant routing algorithm is presented. The fault-tolerant algorithm enhances redundancy in vertical link selection and offers improved network reachability with reduced latency compared to existing solutions, even in the presence of faults.
Furthermore, to address the energy consumption concerns of silicon-photonic-based 2.5D networks, a reconfigurable, power-efficient, and congestion-aware silicon-photonic-based 2.5D interposer network is proposed. The proposed photonic interposer utilizes phase change materials (PCMs) for dynamic reconfiguration and power gating of the photonic network, leading to lower latency and improved energy efficiency. Additionally, the research investigates the integration of optical computation and communication into 2.5D chiplet platforms for domain-specific machine learning (ML) processing. This approach aims to overcome limitations in computation density and communication speeds faced by traditional accelerators, paving the way for sustainable and scalable ML hardware. Furthermore, this dissertation proposes a 2.5D chiplet-based architecture utilizing a silicon-photonic-based interposer, which tackles the limitations of conventional bus-based communication by employing a novel switch-based network, achieving significant energy efficiency improvements for high-bandwidth, low-latency data movement in machine learning accelerators. The switch-based network employs our proposed optical switch based on Mach-Zehnder Interferometer (MZI) devices with a dividing state to facilitate broadcast and optimize communication for ML workloads. Finally, the dissertation explores security considerations in 2.5D chiplet systems with diverse, potentially untrusted chiplets. To address this, a secure routing framework for the Network-on-Interposer (NoI) is presented. The proposed secure framework protects the system against distributed denial-of-service (DDoS) attacks by concealing predictable routing paths. It leverages multi-objective optimization to balance efficiency and reliability for the NoI.
The contributions of this dissertation help advance the field of chip-scale interconnection networks by proposing novel techniques for improved performance, reliability, and power efficiency in 3D and 2.5D NoC architectures. These advancements hold promise for the design of future high-performance computing systems, particularly in the areas of machine learning and other computationally intensive applications.
Item Open Access Design exploration and optimization of silicon photonic integrated circuits under fabrication-process variations (Colorado State University. Libraries, 2024) Mirza, Asif Anwar Baig, author; Nikdast, Mahdi, advisor; Pasricha, Sudeep, advisor; Wilson, Jesse, committee member; Brewer, Samuel, committee member
Silicon photonic integrated circuits (PICs) have become a key solution for handling the growing data-transmission demands of emerging applications, consuming less power and dissipating less heat than electronic circuits while offering ultra-high data bandwidth. With Moore's Law slowing down and the end of Dennard scaling, PICs offer a logical step to improve data movement and processing performance in future computing systems. On PICs, light is processed and routed by means of optical waveguides. In the silicon-on-insulator (SOI) platform, silicon offers a high refractive index contrast, which allows for tight confinement of light in nanometer-scale waveguide cores and bends with a radius of only a few microns. PICs comprise a diverse set of elements such as waveguide splitters, combiners, crossings, and couplers, which help with distribution, routing, and computation of optical signals. Optical signals are converted to electrical signals with the help of photodiodes, which in silicon photonics are implemented using germanium. To enable PICs for wavelength-division multiplexing (WDM), there is a need for efficient wavelength filters consisting of optical delay lines or resonators.
Optical delay lines are usually built using Mach-Zehnder Interferometers (MZIs), each of which consists of a splitter, two waveguides with a given group delay, and a combiner. Other devices such as microring resonators (MRRs) can be used as wavelength filters: a ring is on resonance when a whole number of wavelengths of the input light fits in the circumference of the ring. Other components such as grating couplers help couple the light into and out of a PIC. PICs can be fabricated on the infrastructure developed for complementary metal-oxide-semiconductor (CMOS) electronics. This technology now enables deep submicron features with unprecedented accuracy in large volumes, along with close integration of photonic and electronic circuits. The use of silicon as a base material makes reuse of these manufacturing tools possible, but photonics imposes different demands on the processes. Although silicon photonics offers data transmission and computation at light speed with high bandwidth and low power consumption, the fundamental building blocks in PICs (e.g., optical waveguides) are extremely sensitive to nanometer-scale fabrication-process variations (FPVs) caused by slight randomness in optical lithography processes. Active compensation by means of electronic circuits (a.k.a. tuning) is necessary to compensate for FPVs. Tunable microheaters can be used for active compensation: they alter the material properties of silicon to improve a PIC's performance under FPVs. However, the total power consumed due to tuning in a working PIC can be drastically high. For example, variations as small as 1 nm in an MRR can shift the optical frequency response of the device by 2 nm, which leads to an approximately 25% increase in the tuning power consumption needed to compensate for variations in a single MRR. Additionally, a system can have thousands of such MRRs, which quickly adds to the total power consumption of the system.
To address FPVs, we need to consider reliability not just at the system level but down to the device level, designing reliable, FPV-aware devices that enable FPV-resilient PICs and photonic systems. Designing more reliable and FPV-tolerant photonic devices should not only help reduce the total power consumption but also help build more reliable circuits with fault-free operational behavior for data transmission and computation in future computing systems. This PhD thesis covers the impact of process variations on photonic devices, primarily MRRs. We take a bottom-up approach to improving the reliability of an MRR under FPVs. We propose improved and optimized MRR designs that can be used in any PIC to reduce the overall shift in the resonant wavelength of the device due to FPVs, further reducing the total power consumption required to tune the device. We confirmed our findings by fabricating such MRRs and comparing the improved and optimized designs against conventional MRRs. Furthermore, we study the impact these improved MRRs have on photonic artificial intelligence (AI) accelerators and how they can further improve network accuracy and overall power consumption. Finally, we compile our work into a device exploration tool that allows photonic designers to set design parameters in an MRR and study its behavior under different FPV profiles. With this tool, we aim to give designers the ability to select MRR designs based on the design and performance requirements and the budget constraints set on a photonic system.
Item Open Access Engineering a silicon-photonic bimodal biosensor (Colorado State University.
Libraries, 2024) Mohammad, Ahmed, author; Nikdast, Mahdi, advisor; Lear, Kevin, advisor; Kipper, Matthew, committee member
Biosensors are powerful analytical devices that integrate biological sensing elements with physicochemical transducers to detect and quantify specific analytes, offering wide-ranging applications in fields such as medical diagnostics, environmental monitoring, food safety, and drug discovery. Bimodal waveguide (BiMW) biosensors, a type of interferometric optical biosensor, have proven to be among the best optical biosensors owing to their high sensitivity, real-time detection, and compact design. In their early development, in the early 2010s, the height of the bimodal waveguide was increased to induce interference between the fundamental and first-order modes. Later, in the late 2010s, changes in the width of the bimodal waveguide were introduced to induce this interference. Our novel design builds upon these advancements, focusing on optimizing some parameters, mainly the width of the bimodal biosensor, to enhance performance and sensitivity. Many design variants were simulated to obtain high fringe visibility and to determine whether a reduction at the transmission monitor was due to reduced input power or to a change in the effective index in the sensing region. We then arrived at a design with one input, to maximize fringe visibility, and two outputs, to detect source power fluctuations. Multiple changes in the parameters, such as the width and the offset of the input waveguide, were investigated. In addition, changes in the width of the bimodal waveguide were included in this study. Finally, we varied the gap between the two output bends. All these parameters were varied to obtain higher fringe visibility and thus better sensitivity. Moreover, we discovered that this design requires the sample to be placed on top of the bimodal waveguide, rather than on the sides.
We concluded that the best design we could obtain is the one with a sensitivity of 120 rad/(RIU·cm).
Item Open Access Hardware-software codesign of silicon photonic AI accelerators (Colorado State University. Libraries, 2024) Sunny, Febin P., author; Pasricha, Sudeep, advisor; Nikdast, Mahdi, advisor; Chen, Haonen, committee member; Malaiya, Yashwant K., committee member
Machine learning applications have become increasingly prevalent over the past decade across many real-world use cases, from smart consumer electronics to automotive, healthcare, cybersecurity, and language processing. This prevalence has been fueled by the emergence of powerful machine learning models, such as Deep Neural Networks (DNNs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). As researchers explore deeper models with higher connectivity, the computing power and memory required to train and use them also increase. Such increasing complexity also requires the underlying hardware platform to consistently deliver better performance while satisfying strict power constraints. Unfortunately, the limited performance-per-watt of today's computing platforms (such as general-purpose CPUs, GPUs, and electronic neural network (NN) accelerators) creates significant challenges for the growth of new deep learning and AI applications. These electronic computing platforms face fundamental limits in the post-Moore's Law era due to increased ohmic losses and capacitance-induced latencies in interconnects, as well as power inefficiencies and reliability concerns that reduce yields and increase costs with semiconductor-technology scaling. One way to improve performance-per-watt for AI model processing is to explore more efficient hardware NN accelerator platforms. Silicon photonics has shown promise in terms of achievable energy efficiency and latency for data transfers.
It is also possible to use photonic components to perform computation, e.g., matrix-vector multiplication. Such photonics-based AI accelerators can not only address the fan-in and fan-out problem with linear algebra processors, but their operational bandwidth can approach the photodetection rate (typically in the hundreds of GHz), which is orders of magnitude higher than that of today's electronic systems, which operate at a clock rate of a few GHz. A solution to the data-movement bottleneck can be the use of silicon photonics technology for photonic networks-on-chip (PNoCs), which can enable ultra-high bandwidth, low latency, and energy-efficient communication. However, to ensure reliable, efficient, and high-throughput communication and computation using photonics, several challenges must be addressed first. Photonic computation is performed in the analog domain, which makes it susceptible to various noise sources and drives down the achievable resolution for representing NN model parameters. To increase the reliability of silicon photonic AI accelerators, fabrication-process variation (FPV), which is the change in physical dimensions and characteristics of devices due to imperfections in fabrication, must be addressed. FPVs induce resonant wavelength shifts that must be compensated for the microring resonators (MRs), the fundamental devices that realize photonic computation and communication in our proposed accelerator architectures, to operate correctly. Without this correction, FPVs will cause increased crosstalk and data corruption during photonic communication and can also lead to errors during photonic computation. Accordingly, the correction for FPVs is an essential part of reliable computation in silicon photonic-based AI accelerators. Even with FPV-resilient silicon photonic devices, the tuning latency incurred by thermo-optic (TO) tuning and the thermal crosstalk it can induce are significant.
The latency, which can be in the microsecond range, impacts the overall throughput of the accelerator, and the thermal crosstalk impacts its reliable operation. At the architectural level, it is also necessary to ensure that NN processing makes efficient use of the photonic resources in terms of wavelengths, and NN model-aware decisions must be made about device deployment, arrangement, and multiply-and-accumulate (MAC) unit design. To address these challenges, the major contributions of this thesis focus on a hardware-software co-design framework that enables high-throughput, low-latency, and energy-efficient AI acceleration across various neural network models using silicon photonics. At the architectural level, we have proposed wavelength reuse schemes, vector decomposition, and NN-aware MAC unit designs for increased efficiency in laser power consumption. In terms of NN-aware designs, we have proposed layer-specific acceleration units, photonic batch normalization folding, and fine-grained sparse NN acceleration units. To tackle the reliability challenges introduced by FPV, we have performed device-level design-space exploration and optimization to design MRs that are more tolerant to FPVs than the state-of-the-art efforts in this area. We also adapt Thermal Eigen-mode decomposition and have devised various novel techniques to manage thermal and spectral crosstalk sources, allowing our silicon photonic-based AI accelerators to reach up to 16-bit parameter resolution per MR, which enables high accuracy for most NN models.
Item Open Access Performance assessment of multi-walled carbon nanotube interconnects using advanced polynomial chaos schemes (Colorado State University.
Libraries, 2019) Bhatnagar, Sakshi, author; Nikdast, Mahdi, advisor; Pezeshki, Ali, committee member; Estep, Donald, committee member
With the continuous miniaturization in the latest VLSI technologies, manufacturing uncertainties in nanoscale processes and operations are unpredictable at the chip, package, and board levels of integrated systems. To overcome such issues, simulation solvers are required that model the forward propagation of uncertainties, or variations in random processes, from the device level to the network response. Polynomial Chaos Expansion (PCE) of the random variables is the most common technique to model the unpredictability in such systems. Existing methods for uncertainty quantification have a major drawback: as the number of random variables in a system increases, their computational cost and time increase in a polynomial fashion. To alleviate the poor scalability of standard PC approaches, a predictor-corrector polynomial chaos scheme and a hyperbolic polynomial chaos expansion (HPCE) scheme are proposed in this dissertation. In the predictor-corrector polynomial chaos scheme, a low-fidelity meta-model is generated using the Equivalent Single Conductor (ESC) approximation model, and its accuracy is then enhanced using a low-order multi-conductor circuit (MCC) model, called the corrector model. In HPCE, a sparser polynomial expansion is generated based on the hyperbolic criterion. These schemes result in an immense reduction in CPU cost and time. This dissertation presents a novel approach to quantifying the uncertainties in multi-walled carbon nanotubes using these schemes. The accuracy of these schemes is validated using various numerical examples.
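As a rough illustration of how a hyperbolic criterion can yield a sparser polynomial expansion, the sketch below enumerates the multi-indices retained under a q-norm truncation, a form commonly used for hyperbolic polynomial chaos. This is an assumption about the shape of the criterion: the abstract does not state the exact rule used in the dissertation, and the function name `hyperbolic_index_set` and its parameters are hypothetical.

```python
from itertools import product


def hyperbolic_index_set(num_vars, max_degree, q):
    """Return multi-indices alpha with (sum_i alpha_i**q)**(1/q) <= max_degree.

    Assumed form of the hyperbolic criterion (not taken from the abstract):
    q = 1 reproduces the standard total-degree truncation, while 0 < q < 1
    discards high-order interaction terms and gives a sparser basis.
    """
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    kept = []
    # Brute-force enumeration; adequate only for small illustrative cases.
    for alpha in product(range(max_degree + 1), repeat=num_vars):
        q_norm = sum(a ** q for a in alpha) ** (1.0 / q)
        # Small tolerance guards against floating-point round-off at the boundary.
        if q_norm <= max_degree + 1e-9:
            kept.append(alpha)
    return kept


# Compare basis sizes for 3 random variables and maximum degree 3.
full_basis = hyperbolic_index_set(3, 3, q=1.0)
sparse_basis = hyperbolic_index_set(3, 3, q=0.5)
print(len(full_basis), len(sparse_basis))
```

Under this assumed criterion, lowering q keeps the single-variable terms up to the maximum degree while dropping mixed terms, which is one way the number of expansion coefficients, and hence the number of solver evaluations needed to fit them, can be reduced.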