AI Inference as Relocatable Electricity Demand: A Latency-Constrained Energy-Geography Framework
Xubin Luo, Cheng Yang

TL;DR
This paper develops a framework to analyze how geo-distributed AI inference can shift electricity demand geographically, balancing latency constraints with energy and carbon efficiency.
Contribution
It introduces an energy-geography model for inference placement, a set of operational metrics, and a stylized simulation showing how latency constraints and migration frictions affect relocation.
Findings
Latency relaxation broadens feasible geographic regions for inference.
Migration frictions and legal constraints significantly limit relocation benefits.
Heterogeneous latency tolerance divides workloads into local, regional, and energy-focused layers.
Abstract
AI inference is becoming a persistent and geographically distributed source of electricity demand. Unlike many traditional electrical loads, inference workloads can sometimes be executed away from the user-facing service location, provided that latency, state locality, capacity, and regulatory constraints remain acceptable. This paper studies when such digital relocation of computation can be interpreted as latency-constrained relocation of electricity demand. We develop an energy-geography framework for geo-distributed AI inference. The framework models a three-layer architecture of clients, service nodes, and compute nodes, and formulates inference placement as a constrained optimization problem over electricity prices, marginal carbon intensity, power usage effectiveness, compute capacity, network latency, and migration frictions. The key object is the energy-latency frontier: the…
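As an illustration only, the placement problem described in the abstract can be sketched as a small latency-constrained cost minimization. This is not the authors' model: the `Node` fields, the cost formula, the carbon price, and all numeric values below are hypothetical assumptions chosen to show how a latency budget restricts the feasible set of compute nodes and how migration friction can cancel the benefit of a cheaper, cleaner location. Capacity and legal constraints are omitted for brevity.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical compute node; every field and value is illustrative."""
    name: str
    price: float       # electricity price, $/kWh
    carbon: float      # marginal carbon intensity, kgCO2/kWh
    pue: float         # power usage effectiveness (>= 1.0)
    latency_ms: float  # network latency from the service node
    friction: float    # migration friction, $/request


def cost(node: Node, energy_kwh: float, carbon_price: float) -> float:
    """Assumed per-request cost: facility energy priced at electricity
    plus a carbon charge, plus a flat migration friction."""
    facility_kwh = energy_kwh * node.pue
    return facility_kwh * (node.price + carbon_price * node.carbon) + node.friction


def feasible(nodes: List[Node], budget_ms: float) -> List[Node]:
    """Nodes whose latency fits within the budget."""
    return [n for n in nodes if n.latency_ms <= budget_ms]


def place(nodes: List[Node], budget_ms: float,
          energy_kwh: float = 0.001, carbon_price: float = 0.05) -> Optional[Node]:
    """Cheapest feasible node, or None if no node meets the latency budget."""
    candidates = feasible(nodes, budget_ms)
    if not candidates:
        return None
    return min(candidates, key=lambda n: cost(n, energy_kwh, carbon_price))


# Made-up nodes: the farther the node, the cheaper and cleaner its electricity.
NODES = [
    Node("local", price=0.20, carbon=0.50, pue=1.5, latency_ms=5.0, friction=0.0),
    Node("regional", price=0.10, carbon=0.30, pue=1.3, latency_ms=40.0, friction=0.00002),
    Node("remote", price=0.03, carbon=0.05, pue=1.1, latency_ms=150.0, friction=0.00005),
]

if __name__ == "__main__":
    for budget in (10, 50, 200):
        chosen = place(NODES, budget)
        print(budget, len(feasible(NODES, budget)), chosen.name if chosen else None)
```

Under these assumed numbers, relaxing the latency budget enlarges the feasible set and moves the chosen node toward the cheaper, lower-carbon location, while raising the `friction` value of the remote node high enough makes the regional node the cheaper choice even when the remote node is feasible.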
