Decentralized LLM Inference over Edge Networks with Energy Harvesting

Aria Khoshsirat; Giovanni Perin; Michele Rossi

arXiv:2408.15907·cs.DC·January 21, 2026

Open Access

TL;DR

This paper introduces a sustainable, energy-aware framework for decentralized inference of large language models on battery-powered edge devices with energy harvesting, with scheduling that aims to minimize device downtime and maximize network throughput.

Contribution

It develops a semi-Markov model of device states and scheduling algorithms tailored to energy-harvesting edge devices, enabling energy-efficient decentralized LLM inference.

Findings

01. Effective scheduling algorithms reduce device downtime.

02. Empirical evaluations show improved network throughput.

03. Validated approach for energy-efficient decentralized inference.

Abstract

Large language models have significantly transformed multiple fields with their exceptional performance in natural language tasks, but their deployment in resource-constrained environments like edge networks presents an ongoing challenge. Decentralized techniques for inference have emerged, distributing the model blocks among multiple devices to improve flexibility and cost effectiveness. However, energy limitations remain a significant concern for edge devices. We propose a sustainable model for collaborative inference on interconnected, battery-powered edge devices with energy harvesting. A semi-Markov model is developed to describe the states of the devices, considering processing parameters and average green energy arrivals. This informs the design of scheduling algorithms that aim to minimize device downtimes and maximize network throughput. Through empirical evaluations and…

Taxonomy

Topics: Advanced Memory and Neural Computing · Molecular Communication and Nanonetworks · Neural Networks and Reservoir Computing