
Publications

Recent Submissions

Now showing 1 - 20 of 565
  • ItemOpen Access
    SHIELD: sustainable hybrid evolutionary learning framework for carbon, wastewater, and energy-aware data center management
    (Colorado State University. Libraries, 2024-05-09) Qi, Sirui, author; Milojicic, Dejan, author; Bash, Cullen, author; Pasricha, Sudeep, author; ACM, publisher
    Today's cloud data centers are often distributed geographically to provide robust data services. But these geo-distributed data centers (GDDCs) have a significant associated environmental impact due to their increasing carbon emissions and water usage, which needs to be curtailed. Moreover, the energy costs of operating these data centers continue to rise. This paper proposes a novel framework to co-optimize carbon emissions, water footprint, and energy costs of GDDCs, using a hybrid workload management framework called SHIELD that integrates machine learning guided local search with a decomposition-based evolutionary algorithm. Our framework considers geographical factors and time-based differences in power generation/use, costs, and environmental impacts to intelligently manage workload distribution across GDDCs and data center operation. Experimental results show that SHIELD can realize 34.4× speedup and 2.1× improvement in Pareto Hypervolume while reducing the carbon footprint by up to 3.7×, water footprint by up to 1.8×, energy costs by up to 1.3×, and a cumulative improvement across all objectives (carbon, water, cost) of up to 4.8× compared to the state-of-the-art.
  • ItemOpen Access
    SerIOS: enhancing hardware security in integrated optoelectronic systems
    (Colorado State University. Libraries, 2024-06-21) Göhring de Magalhães, Felipe, author; Nikdast, Mahdi, author; Nicolescu, Gabriela, author; ACM, publisher
    Silicon photonics (SiPh) has different applications, from enabling fast and high-bandwidth communication for high-performance computing systems to realizing energy-efficient optical computation for AI hardware accelerators. However, integrating SiPh with electronic sub-systems can introduce new security vulnerabilities that cannot be adequately addressed using existing hardware security solutions for electronic systems. This paper introduces SerIOS, the first framework aimed at enhancing hardware security in optoelectronic systems by leveraging the unique properties of optical lithography. SerIOS employs cryptographic keys generated based on imperfections in the optical lithography process and an online detection mechanism to detect attacks. Simulation and synthesis results demonstrate SerIOS's effectiveness in detecting and preventing attacks, with a small area footprint of less than 15% and a 100% detection rate across various attack scenarios and optoelectronic architectures, including photonic AI accelerators.
  • ItemOpen Access
    Life-after-death: exploring thermal annealing conditions to enhance 3D NAND SSD endurance
    (Colorado State University. Libraries, 2024-07-08) Buddhanoy, Matchima, author; Pasricha, Sudeep, author; Ray, Biswajit, author; ACM, publisher
    In this paper, we evaluate thermal annealing effects on the endurance of commercial off-the-shelf (COTS) 3D NAND flash memory beyond its end-of-life. We systematically evaluate the effects of anneal duration, anneal temperature, and state of the memory cells during annealing on the endurance enhancement. Interestingly, we find that endurance enhancement critically depends on the state of flash memory cells during annealing, with programmed cells showing significantly larger improvements than erased cells. Our experimental evaluation indicates that the post-cycle data retention property of an annealed chip significantly improves after thermal annealing, resulting in ∼30% endurance recovery. Our results have significant implications for the future wear-leveling algorithms of SSD-based storage systems.
  • ItemOpen Access
    Improving block management in 3D NAND flash SSDs with sub-block first write sequencing
    (Colorado State University. Libraries, 2024-06-12) Buddhanoy, Matchima, author; Khan, Kamil, author; Milenkovic, Aleksandar, author; Pasricha, Sudeep, author; Ray, Biswajit, author; ACM, publisher
    Continual vertical scaling in 3D NAND flash solid-state drives (SSDs) results in larger memory blocks, causing performance degradation due to big-block management issues. Pages within a 3D NAND flash block are traditionally written using layer first write sequencing. This paper introduces and explores the benefits of an alternative sub-block first write sequence. This method, when coupled with sub-block erase operations, promises to alleviate the big-block problem. Our evaluation on a commercial 32-layer 3D NAND flash SSD chip shows that although the proposed method increases the raw bit error rate (RBER), it remains below the threshold that can be corrected by error correction codes (ECCs). Simulation analysis further shows that our proposed method reduces garbage collection overhead, resulting in 36.0% lower response time and 9.6% reduction in additional writes due to garbage collection compared to traditional 3D NAND flash SSDs.
  • ItemOpen Access
    RISA: round-robin intra-rack friendly scheduling algorithm for disaggregated datacenters
    (Colorado State University. Libraries, 2023-11-12) Kabir, Rashadul, author; Kim, Ryan G., author; Nikdast, Mahdi, author; ACM, publisher
    Recent trends see a move away from a fixed-resource server-centric datacenter model to a more adaptable "disaggregated" datacenter model. These disaggregated datacenters can then dynamically group resources to the specific requirements of an incoming workload, thereby improving efficiency. To properly utilize these disaggregated datacenters, workload allocation techniques must examine the current state of the datacenter and choose resources that optimize not only the current workload request but also future ones. Since disaggregated datacenters are severely bottlenecked by the available network resources, our work proposes a heuristic-based approach called RISA, which significantly reduces the network usage of workload allocations in disaggregated datacenters. Compared to the state-of-the-art, RISA reduces the power consumption for optical components by 33% and reduces the average CPU-RAM round-trip latency by 50%. Additionally, RISA significantly outperforms the state-of-the-art in terms of execution time.
  • ItemOpen Access
    SCRIPT: a multi-objective routing framework for securing chiplet systems against distributed DoS attacks
    (Colorado State University. Libraries, 2024-06-12) Taheri, Ebadollah, author; Aghanoury, Pooya, author; Pasricha, Sudeep, author; Nikdast, Mahdi, author; Sehatbakhsh, Nader, author; ACM, publisher
    Heterogeneous 2.5D integration enables seamless integration of chiplets, hence reducing design time and costs. Concerns arise when dealing with untrustworthy chiplets, emphasizing the need for dependable Network-on-Interposer (NoI). This paper introduces SCRIPT, a secure routing framework to mitigate Distributed Denial-of-Service (DDoS) attacks in chiplet systems. SCRIPT obscures predictable paths exploited by attackers, disrupting orchestrated attacks. SCRIPT considers chiplet trust and criticality and employs a multi-objective optimization technique to enhance NoI performance and reliability. Evaluations show that SCRIPT enhances NoI security by at least 64% against DDoS attacks.
  • ItemOpen Access
    FedHIL: heterogeneity resilient federated learning for robust indoor localization with mobile devices
    (Colorado State University. Libraries, 2023-09-09) Gufran, Danish, author; Pasricha, Sudeep, author; ACM, publisher
    Indoor localization plays a vital role in applications such as emergency response, warehouse management, and augmented reality experiences. By deploying machine learning (ML) based indoor localization frameworks on their mobile devices, users can localize themselves in a variety of indoor and subterranean environments. However, achieving accurate indoor localization can be challenging due to heterogeneity in the hardware and software stacks of mobile devices, which can result in inconsistent and inaccurate location estimates. Traditional ML models also heavily rely on initial training data, making them vulnerable to degradation in performance with dynamic changes across indoor environments. To address the challenges due to device heterogeneity and lack of adaptivity, we propose a novel embedded ML framework called FedHIL. Our framework combines indoor localization and federated learning (FL) to improve indoor localization accuracy in device-heterogeneous environments while also preserving user data privacy. FedHIL integrates a domain-specific selective weight adjustment approach to preserve the ML model's performance for indoor localization during FL, even in the presence of extremely noisy data. Experimental evaluations in diverse real-world indoor environments and with heterogeneous mobile devices show that FedHIL outperforms state-of-the-art FL and non-FL indoor localization frameworks. FedHIL is able to achieve 1.62 × better localization accuracy on average than the best performing FL-based indoor localization framework from prior work.
  • ItemOpen Access
    Design space exploration for PCM-based photonic memory
    (Colorado State University. Libraries, 2023-06-05) Shafiee, Amin, author; Charbonnier, Benoit, author; Pasricha, Sudeep, author; Nikdast, Mahdi, author; ACM, publisher
    The integration of silicon photonics (SiPh) and phase change materials (PCMs) has created a unique opportunity to realize adaptable and reconfigurable photonic systems. In particular, the nonvolatile programmability in PCMs has made them a promising candidate for implementing optical memory systems. In this paper, we describe the design of an optical memory cell based on PCMs while exploring the design space of the cell in terms of PCM material choice (e.g., GST, GSST, Sb2Se3), cell bit capacity, latency, and power consumption. Leveraging this design-space exploration for the design of efficient optical memory cells, we present the design and implementation of an optical memory array and explore its scalability and power consumption when using different optical memory cells. We also identify performance bottlenecks that need to be alleviated to further scale optical memory arrays with competitive latency and energy consumption, compared to their electronic counterparts.
  • ItemOpen Access
    TRINE: a tree-based silicon photonic interposer network for energy-efficient 2.5D machine learning acceleration
    (Colorado State University. Libraries, 2023-10-28) Taheri, Ebadollah, author; Mahdian, Mohammad Amin, author; Pasricha, Sudeep, author; Nikdast, Mahdi, author; ACM, publisher
    2.5D chiplet systems have showcased low manufacturing costs and modular designs for machine learning (ML) acceleration. Nevertheless, communication challenges arise from chiplet interconnectivity and high-bandwidth demands among chiplets. To address these challenges, we present TRINE, a novel tree-based silicon photonic interposer network for energy-efficient ML acceleration. Leveraging silicon photonics and broadband optical switching, TRINE enables efficient inter-chiplet communication with reduced latency and improved energy efficiency. Considering several ML workloads, our simulation results demonstrate significant improvements in the average energy efficiency by 61.7% and 40% when comparing TRINE with two recently proposed silicon photonic interposer networks. By overcoming communication limitations in 2.5D ML accelerators, this work is a promising step towards advancing 2.5D photonic-based ML accelerator design.
  • ItemOpen Access
    GHOST: a graph neural network accelerator using silicon photonics
    (Colorado State University. Libraries, 2023-09-09) Afifi, Salma, author; Sunny, Febin, author; Shafiee, Amin, author; Nikdast, Mahdi, author; Pasricha, Sudeep, author; ACM, publisher
    Graph neural networks (GNNs) have emerged as a powerful approach for modelling and learning from graph-structured data. Multiple fields have since benefitted enormously from the capabilities of GNNs, such as recommendation systems, social network analysis, drug discovery, and robotics. However, accelerating and efficiently processing GNNs require a unique approach that goes beyond conventional artificial neural network accelerators, due to the substantial computational and memory requirements of GNNs. The slowdown of scaling in CMOS platforms also motivates a search for alternative implementation substrates. In this paper, we present GHOST, the first silicon-photonic hardware accelerator for GNNs. GHOST efficiently alleviates the costs associated with both vertex-centric and edge-centric operations. It implements separately the three main stages involved in running GNNs in the optical domain, allowing it to be used for the inference of various widely used GNN models and architectures, such as graph convolution networks and graph attention networks. Our simulation studies indicate that GHOST exhibits at least 10.2 × better throughput and 3.8 × better energy efficiency when compared to GPU, TPU, CPU and multiple state-of-the-art GNN hardware accelerators.
  • ItemOpen Access
    TRON: transformer neural network acceleration with non-coherent silicon photonics
    (Colorado State University. Libraries, 2023-06-05) Afifi, Salma, author; Sunny, Febin, author; Nikdast, Mahdi, author; Pasricha, Sudeep, author; ACM, publisher
    Transformer neural networks are rapidly being integrated into state-of-the-art solutions for natural language processing (NLP) and computer vision. However, the complex structure of these models creates challenges for accelerating their execution on conventional electronic platforms. We propose the first silicon photonic hardware neural network accelerator called TRON for transformer-based models such as BERT and Vision Transformers. Our analysis demonstrates that TRON exhibits at least 14× better throughput and 8× better energy efficiency, in comparison to state-of-the-art transformer accelerators.
  • ItemOpen Access
    Ethics in computing education: challenges and experience with embedded ethics
    (Colorado State University. Libraries, 2023-06-05) Pasricha, Sudeep, author; ACM, publisher
    The next generation of computer engineers and scientists must be proficient in not just the technical knowledge required to analyze, optimize, and create computing systems, but also with the skills required to make ethical decisions during design. Teaching computer ethics in computing curricula is therefore becoming an important requirement with significant ramifications for our increasingly connected and computing-reliant society. In this paper, we reflect on the many challenges and questions with effectively integrating ethics into modern computing curricula. We describe a case study of integrating ethics modules into the computer engineering curricula at Colorado State University.
  • ItemOpen Access
    Cross-layer design for AI acceleration with non-coherent optical computing
    (Colorado State University. Libraries, 2023-06-05) Sunny, Febin, author; Nikdast, Mahdi, author; Pasricha, Sudeep, author; ACM, publisher
    Emerging AI applications such as ChatGPT, graph convolutional networks, and other deep neural networks require massive computational resources for training and inference. Contemporary computing platforms such as CPUs, GPUs, and TPUs are struggling to keep up with the demands of these AI applications. Non-coherent optical computing represents a promising approach for light-speed acceleration of AI workloads. In this paper, we show how cross-layer design can overcome challenges in non-coherent optical computing platforms. We describe approaches for optical device engineering, tuning circuit enhancements, and architectural innovations to adapt optical computing to a variety of AI workloads. We also discuss techniques for hardware/software co-design that can intelligently map and adapt AI software to improve performance on non-coherent platforms.
  • ItemOpen Access
    A Bayesian correction approach for improving dual-frequency precipitation radar rainfall rate estimates
    (Colorado State University. Libraries, 2020-01-27) Ma, Yingzhao, author; Chandrasekar, V., author; Biswas, Sounak K., author; Journal of Meteorological Society of Japan, publisher
    The accurate estimation of precipitation is an important objective for the Dual-frequency Precipitation Radar (DPR), which is located on board the Global Precipitation Measurement (GPM) satellite core observatory. In this study, a Bayesian correction (BC) approach is proposed to improve the DPR’s instantaneous rainfall rate product. Ground dual-polarization radar (GR) observations are used as references, and a log-transformed Gaussian distribution is assumed as the instantaneous rainfall process. Additionally, a generalized regression model is adopted in the BC algorithm. Rainfall intensities such as light, moderate, and heavy rain and their variable influences on the model’s performance are considered. The BC approach quantifies the predictive uncertainties associated with the Bayesian-corrected DPR (DPR_BC) rainfall rate estimates. To demonstrate the concepts developed in this study, data from the GPM overpasses of the Weather Service Surveillance Radar (WSR-88D), KHGX, in Houston, Texas, between April 2014 and June 2018 are used. Observation errors in the DPR instantaneous rainfall rate estimates are analyzed as a function of rainfall intensity. Moreover, the best-performing BC model is implemented in three GPM-overpass cases with heavy rainfall records across the southeastern United States. The results show that the DPR_BC rainfall rate estimates have superior skill scores and are in better agreement with the GR references than the DPR estimates. This study demonstrates the potential of the proposed BC algorithm for enhancing the instantaneous rainfall rate product from spaceborne radar equipment.
  • ItemOpen Access
    Wavefront improvement in an injection-seeded soft x-ray laser based on a solid-target plasma amplifier
    (Colorado State University. Libraries, 2013-10-15) Li, Lu, author; Wang, Yong, author; Wang, Shoujun, author; Oliva, Eduardo, author; Yin, Liang, author; Le, T. T. Thuy, author; Daboussi, Sameh, author; Ros, David, author; Maynard, Gilles, author; Sebban, Stephane, author; Hu, Bitao, author; Rocca, Jorge J, author; Zeitoun, Philippe, author; Optical Society of America, publisher
    The wavefront of an injection-seeded soft x-ray laser beam generated by amplification of high-harmonic pulses in a λ=18.9 nm molybdenum plasma amplifier was measured by a Hartmann wavefront sensor with an accuracy of λ/32 root mean square (rms). A significant improvement in wavefront aberrations from 0.51±0.03λ rms to 0.23±0.01λ rms was observed as a function of plasma column length. The variation of wavefront characteristics as a function of the time delay between the injection of the seed and the peak of the soft x-ray amplifier pump was studied. The measurements were used to reconstruct the soft x-ray source and confirm its high peak brightness.
  • ItemOpen Access
    Plasmoid ejection and secondary current sheet generation from magnetic reconnection in laser-plasma interaction
    (Colorado State University. Libraries, 2012-05-25) Dong, Quan-Li, author; Wang, Shou-Jun, author; Lu, Quan-Ming, author; Huang, Can, author; Yuan, Da-Wei, author; Liu, Xun, author; Lin, Xiao-Xuan, author; Li, Yu-Tong, author; Wei, Hui-Gang, author; Zhong, Jia-Yong, author; Shi, Jian-Rong, author; Jiang, Shao-En, author; Ding, Yong-Kun, author; Jiang, Bo-Bin, author; Du, Kai, author; He, Xian-Tu, author; Yu, M. Y., author; Liu, C. S., author; Wang, Shui, author; Tang, Yong-Jian, author; Zhu, Jian-Qiang, author; Zhao, Gang, author; Sheng, Zheng-Ming, author; Zhang, Jie, author; American Physical Society, publisher
    Reconnection of the self-generated magnetic fields in laser-plasma interaction was first investigated experimentally by Nilson et al. [Phys. Rev. Lett. 97, 255001 (2006)] by shining two laser pulses a distance apart on a solid target layer. An elongated current sheet (CS) was observed in the plasma between the two laser spots. In order to more closely model magnetotail reconnection, here two side-by-side thin target layers, instead of a single one, are used. It is found that at one end of the elongated CS a fanlike electron outflow region including three well-collimated electron jets appears. The (>1 MeV) tail of the jet energy distribution exhibits a power-law scaling. The enhanced electron acceleration is attributed to the intense inductive electric field in the narrow electron-dominated reconnection region, as well as to additional acceleration of the electrons while they are trapped inside the rapidly moving plasmoid formed in and ejected from the CS. The ejection also induces a secondary CS.
  • ItemOpen Access
    Single-shot soft x-ray laser linewidth measurement using a grating interferometer
    (Colorado State University. Libraries, 2013-12-01) Wang, Y., author; Yin, L., author; Wang, S., author; Marconi, M. C., author; Dunn, J., author; Gullikson, E., author; Rocca, J. J., author; Optical Society of America, publisher
    The linewidth of a 14.7 nm wavelength Ni-like Pd soft x-ray laser was measured in a single shot using a soft x-ray diffraction grating interferometer. The instrument uses the time delay introduced by the gratings across the beam to measure the temporal coherence. The spectral linewidth of the 4d ¹S₀–4p ¹P₁ Ni-like Pd lasing line was measured to be Δλ/λ=3×10⁻⁵ from the Fourier transform of the fringe visibility. This single shot linewidth measurement technique provides a rapid and accurate way to determine the temporal coherence of soft x-ray lasers that can contribute to the development of femtosecond plasma-based soft x-ray lasers.
  • ItemOpen Access
    Efficient picosecond x-ray pulse generation from plasmas in the radiation dominated regime
    (Colorado State University. Libraries, 2017-10-27) Hollinger, Reed, author; Bargsten, Clayton, author; Shlyaptsev, Vyacheslav N., author; Kaymak, Vural, author; Pukhov, Alexander, author; Capeluto, Maria Gabriela, author; Wang, Shoujun, author; Rockwood, Alex, author; Wang, Yong, author; Townsend, Amanda, author; Prieto, Amy, author; Stockton, Patrick, author; Curtis, Alden, author; Rocca, Jorge J., author; Optical Society of America, publisher
    The efficient conversion of optical laser light into bright ultrafast x-ray pulses in laser created plasmas is of high interest for dense plasma physics studies, material science, and other fields. However, the rapid hydrodynamic expansion that cools hot plasmas has limited the x-ray conversion efficiency (CE) to 1% or less. Here we demonstrate more than one order of magnitude increase in picosecond x-ray CE by tailoring near solid density plasmas to achieve a large radiative to hydrodynamic energy loss rate ratio, leading into a radiation loss dominated plasma regime. A record 20% CE into hν > 1 keV photons was measured in arrays of large aspect ratio Au nanowires heated to keV temperatures with ultrahigh contrast femtosecond laser pulses of relativistic intensity. The potential of these bright ultrafast x-ray point sources for table-top imaging is illustrated with single shot flash radiographs obtained using low laser pulse energy. These results will enable the deployment of brighter laser driven x-ray sources at both compact and large laser facilities.
  • ItemOpen Access
    Characteristic measurements of silicon dioxide aerogel plasmas generated in a Planckian radiation environment
    (Colorado State University. Libraries, 2010-01-06) Dong, Quan-Li, author; Wang, Shou-Jun, author; Li, Yu-Tong, author; Zhang, Yi, author; Zhao, Jing, author; Wei, Hui-Gang, author; Shi, Jian-Rong, author; Zhao, Gang, author; Zhang, Ji-Yan, author; Gu, Yu-Qiu, author; Ding, Yong-Kun, author; Wen, Tian-Shu, author; Zhang, Wen-Hai, author; Hu, Xin, author; Liu, Shen-Ye, author; Zhang, Lin, author; Tang, Yong-Jian, author; Zhang, Bao-Han, author; Zheng, Zhi-Jian, author; Nishimura, Hiroaki, author; Fujioka, Shinsuke, author; Wang, Fei-Lu, author; Takabe, Hideaki, author; Zhang, Jie, author; American Institute of Physics, publisher
    The temporally and spatially resolved characteristics of silicon dioxide aerogel plasmas were studied using x-ray spectroscopy. The plasma was generated in the near-Planckian radiation environment within gold hohlraum targets irradiated by laser pulses with a total energy of 2.4 kJ in 1 ns. The contributions of silicon ions at different charge states to the specific components of the measured absorption spectra were also investigated. It was found that each main feature in the absorption spectra of the measured silicon dioxide aerogel plasmas was contributed by two neighboring silicon ionic species.
  • ItemOpen Access
    Micro-scale fusion in dense relativistic nanowire array plasmas
    (Colorado State University. Libraries, 2018-03-14) Curtis, Alden, author; Calvi, Chase, author; Tinsley, James, author; Hollinger, Reed, author; Kaymak, Vural, author; Pukhov, Alexander, author; Wang, Shoujun, author; Rockwood, Alex, author; Wang, Yong, author; Shlyaptsev, Vyacheslav N., author; Rocca, Jorge J., author; Nature Research, publisher
    Nuclear fusion is regularly created in spherical plasma compressions driven by multi-kilojoule pulses from the world’s largest lasers. Here we demonstrate a dense fusion environment created by irradiating arrays of deuterated nanostructures with joule-level pulses from a compact ultrafast laser. The irradiation of ordered deuterated polyethylene nanowire arrays with femtosecond pulses of relativistic intensity creates ultra-high energy density plasmas in which deuterons (D) are accelerated up to MeV energies, efficiently driving D–D fusion reactions and ultrafast neutron bursts. We measure up to 2 × 10⁶ fusion neutrons per joule, an increase of about 500 times with respect to flat solid targets, a record yield for joule-level lasers. Moreover, in accordance with simulation predictions, we observe a rapid increase in neutron yield with laser pulse energy. The results will impact nuclear science and high energy density research and can lead to bright ultrafast quasi-monoenergetic neutron point sources for imaging and materials studies.