Sustainable carbon-aware and water-efficient LLM scheduling in geo-distributed cloud datacenters

Moore, Hayden, author; Qi, Sirui, author; Hogade, Ninad, author; Milojicic, Dejan, author; Bash, Cullen, author; Pasricha, Sudeep, author; ACM, publisher

dc.contributor.author	Moore, Hayden, author
dc.contributor.author	Qi, Sirui, author
dc.contributor.author	Hogade, Ninad, author
dc.contributor.author	Milojicic, Dejan, author
dc.contributor.author	Bash, Cullen, author
dc.contributor.author	Pasricha, Sudeep, author
dc.contributor.author	ACM, publisher
dc.date.accessioned	2025-09-25T18:41:06Z
dc.date.available	2025-09-25T18:41:06Z
dc.date.issued	2025-06-29
dc.description.abstract	In recent years, Large Language Models (LLMs) such as ChatGPT, Copilot, and Gemini have been widely adopted in different areas. As the use of LLMs continues to grow, many efforts have focused on reducing the massive training overheads of these models. But it is the environmental impact of handling user requests to LLMs that is increasingly becoming a concern. Recent studies estimate that the costs of operating LLMs in their inference phase can exceed training costs by 25× per year. As LLMs are queried incessantly, the cumulative carbon footprint for the operational phase has been shown to far exceed the footprint during the training phase. Further, estimates indicate that 500 ml of fresh water is expended for every 20-50 requests to LLMs during inference. To address these important sustainability issues with LLMs, we propose a novel framework called SLIT to co-optimize LLM quality of service (time-to-first-token), carbon emissions, water usage, and energy costs. The framework utilizes a machine learning (ML) based metaheuristic to enhance the sustainability of LLM hosting across geo-distributed cloud datacenters. Such a framework will become increasingly vital as LLMs proliferate.
dc.format.medium	born digital
dc.format.medium	articles
dc.identifier.bibliographicCitation	Hayden Moore, Sirui Qi, Ninad Hogade, Dejan Milojicic, Cullen Bash, and Sudeep Pasricha. 2025. Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters. In Great Lakes Symposium on VLSI 2025 (GLSVLSI '25), June 30-July 02, 2025, New Orleans, LA, USA. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3716368.3735301
dc.identifier.doi	https://doi.org/10.1145/3716368.3735301
dc.identifier.uri	https://hdl.handle.net/10217/242040
dc.language	English
dc.language.iso	eng
dc.publisher	Colorado State University. Libraries
dc.relation.ispartof	Publications
dc.relation.ispartof	ACM DL Digital Library
dc.rights	©Hayden Moore, et al. ACM 2025. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in GLSVLSI '25, https://dx.doi.org/10.1145/3716368.3735301.
dc.subject	large language model
dc.subject	carbon emissions
dc.subject	water
dc.subject	energy cost
dc.title	Sustainable carbon-aware and water-efficient LLM scheduling in geo-distributed cloud datacenters
dc.type	Text

Files

Original bundle

Name: FACF_ACMOA_3716368.3735301.pdf
Size: 1.58 MB
Format: Adobe Portable Document Format

Collections

Publications