Green-LLM: Optimal Workload Allocation for Environmentally-Aware Distributed Inference
By: Jiaming Cheng, Duong Tung Nguyen
Potential Business Impact:
Routes AI inference tasks to data centers where power is cheapest and greenest, cutting both cost and emissions.
This letter investigates the optimal allocation of large language model (LLM) inference workloads across heterogeneous edge data centers (DCs) over time. Each DC features on-site renewable generation and faces dynamic electricity prices and spatiotemporal variability in renewable availability. The central question is: how can inference workloads be optimally distributed to the DCs to minimize energy consumption, carbon emissions, and water usage while enhancing user experience? This letter proposes a novel optimization model for LLM service providers to reduce operational costs and environmental impacts. Numerical results validate the efficacy of the proposed approach.
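The letter's optimization model is not reproduced here, but the core allocation idea can be sketched as a small linear program: route inference load to data centers so that on-site renewables are consumed first and the remaining grid draw is placed where a combined price-plus-carbon cost is lowest. All DC parameters, the carbon weight, and the single-period formulation below are illustrative assumptions, not the authors' actual model (which is multi-period and also accounts for water usage and user experience).

```python
# Minimal sketch (assumed toy instance, not the paper's model): allocate an
# inference workload across edge DCs to minimize grid electricity cost plus
# a carbon penalty, via a linear program.
import numpy as np
from scipy.optimize import linprog

# Three hypothetical edge data centers (all figures are made up).
price = np.array([0.12, 0.08, 0.15])       # $/kWh dynamic grid price
carbon = np.array([0.40, 0.55, 0.20])      # kgCO2/kWh grid carbon intensity
renewable = np.array([30.0, 10.0, 50.0])   # kWh of free on-site renewables
capacity = np.array([100.0, 80.0, 120.0])  # max kWh of load each DC can take
demand = 200.0                             # total kWh of inference work
carbon_weight = 0.05                       # assumed $ per kgCO2 trade-off

# Grid draw at DC i is max(x_i - renewable_i, 0); linearize it with an
# auxiliary variable g_i >= x_i - renewable_i, g_i >= 0.
# Decision vector z = [x_1..x_n, g_1..g_n]; cost applies only to grid draw g.
n = len(price)
c = np.concatenate([np.zeros(n), price + carbon_weight * carbon])

# Inequality A_ub @ z <= b_ub encodes x_i - g_i <= renewable_i.
A_ub = np.hstack([np.eye(n), -np.eye(n)])
b_ub = renewable

# Equality: the allocations must cover total demand.
A_eq = np.hstack([np.ones((1, n)), np.zeros((1, n))])
b_eq = [demand]

bounds = [(0.0, cap) for cap in capacity] + [(0.0, None)] * n
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.x[:n])  # kWh of workload assigned to each DC
```

In this toy instance the solver fills each DC's free renewable capacity first, then sends the residual grid-powered load to the DC with the lowest price-plus-carbon cost. The full problem in the letter adds the time dimension (prices and renewable availability vary per slot) and further objectives, but the allocation structure is the same flavor.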
Similar Papers
Large Language Model-Based Task Offloading and Resource Allocation for Digital Twin Edge Computing Networks
Networking and Internet Architecture
Helps cars share computing power to avoid delays.
Energy-Aware LLMs: A step towards sustainable AI for downstream applications
Performance
Saves energy while making AI smarter.
Constraint-Compliant Network Optimization through Large Language Models
Networking and Internet Architecture
Makes computer networks follow rules perfectly.