Design and optimization of efficient, fault-tolerant and secure 2.5D chiplet systems
Date
2024
Abstract
In response to the growing demand for high-performance computing systems, this Ph.D. dissertation investigates key challenges surrounding Networks-on-Chip (NoCs) in 2.5D and 3D integration technologies, with the primary objective of enhancing the efficiency, fault tolerance, and security of future computing system architectures. The inherent limitations in bandwidth and reliability at chiplet boundaries in 2.5D chiplet systems pose significant challenges for traffic management, latency, and energy efficiency. Furthermore, the global network on the interposer, which links multiple chiplets, requires high-bandwidth, low-latency communication to accommodate the substantial traffic generated by numerous cores across diverse chiplets. The dissertation addresses several design aspects of NoCs: latency, energy efficiency, fault tolerance, and security. It explores the design of 3D NoCs that use Through-Silicon Vias (TSVs) for vertical communication. To address the reliability concerns and fabrication costs associated with high TSV density, a Partially Connected 3D NoC (PC-3DNoC) has been proposed. An adaptive congestion-aware TSV link selection algorithm is introduced to manage traffic load and optimize communication, resulting in reduced latency and improved energy efficiency. For 2.5D chiplet systems, a novel deadlock-free and fault-tolerant routing algorithm is presented. This algorithm enhances redundancy in vertical link selection and offers improved network reachability with reduced latency compared to existing solutions, even in the presence of faults. Furthermore, to address the energy consumption of silicon-photonic-based 2.5D networks, a reconfigurable, power-efficient, and congestion-aware silicon-photonic-based 2.5D interposer network is proposed.
The proposed photonic interposer uses phase change materials (PCMs) for dynamic reconfiguration and power gating of the photonic network, leading to lower latency and improved energy efficiency. Additionally, the research investigates the integration of optical computation and communication into 2.5D chiplet platforms for domain-specific machine learning (ML) processing. This approach aims to overcome the limitations in computation density and communication speed faced by traditional accelerators, paving the way for sustainable and scalable ML hardware. Furthermore, this dissertation proposes a 2.5D chiplet-based architecture with a silicon-photonic-based interposer that replaces conventional bus-based communication with a novel switch-based network, achieving significant energy efficiency improvements for high-bandwidth, low-latency data movement in ML accelerators. The switch-based network employs the proposed optical switch, built from Mach-Zehnder Interferometer (MZI) devices with a dividing state, to facilitate broadcast and optimize communication for ML workloads. Finally, the dissertation explores security considerations in 2.5D chiplet systems that integrate diverse, potentially untrusted chiplets. To address these concerns, a secure routing framework for the Network-on-Interposer (NoI) is presented. The framework protects the system against distributed denial-of-service (DDoS) attacks by concealing predictable routing paths, and it leverages multi-objective optimization to balance efficiency and reliability for the NoI. The contributions of this dissertation advance the field of chip-scale interconnection networks through novel techniques for improved performance, reliability, and power efficiency in 3D and 2.5D NoC architectures.
These advancements hold promise for the design of future high-performance computing systems, particularly in the areas of machine learning and other computationally intensive applications.
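The congestion-aware TSV link selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the function name `select_tsv`, the cost model (Manhattan hop count plus a weighted congestion term), and the weight `alpha` are all assumptions chosen only to show the general idea of trading path length against load when only some routers have a vertical link.

```python
# Illustrative sketch only: the names and the cost model are assumptions,
# not the method proposed in the dissertation.

def manhattan(a, b):
    """Hop distance between two (x, y) router coordinates on a 2D mesh layer."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def select_tsv(src, dst, tsv_positions, congestion, alpha=1.0):
    """Pick the TSV (vertical link) with the lowest combined cost.

    src, dst      -- (x, y) coordinates of source and destination routers,
                     assumed to sit on different layers
    tsv_positions -- list of (x, y) coordinates that have a TSV
    congestion    -- dict mapping a TSV position to a load estimate in [0, 1]
                     (hypothetical metric, e.g. buffer occupancy)
    alpha         -- hypothetical weight converting congestion into hop units
    """
    def cost(tsv):
        # Hops to reach the TSV on the source layer, then from it to the
        # destination on the target layer.
        hops = manhattan(src, tsv) + manhattan(tsv, dst)
        return hops + alpha * congestion.get(tsv, 0.0)

    return min(tsv_positions, key=cost)


if __name__ == "__main__":
    tsvs = [(1, 1), (5, 5)]
    load = {(1, 1): 1.0, (5, 5): 0.0}
    print(select_tsv((0, 0), (3, 3), tsvs, load, alpha=4.0))
    print(select_tsv((0, 0), (3, 3), tsvs, load, alpha=10.0))
```

With a small `alpha` the nearer but congested TSV is still chosen; with a larger `alpha` the selection shifts to the farther, idle TSV.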