Decentralized LLM Inference over Edge Networks with Energy Harvesting
Aria Khoshsirat, Giovanni Perin, Michele Rossi

TL;DR
This paper introduces a sustainable, energy-aware decentralized inference framework for large language models on battery-powered edge devices with energy harvesting, optimizing device uptime and network throughput.
Contribution
The paper develops a semi-Markov model of device states and scheduling algorithms tailored to energy-harvesting edge devices, enabling energy-efficient decentralized LLM inference.
Findings
The proposed scheduling algorithms reduce device downtime.
Empirical evaluations show improved network throughput.
The evaluations support the approach as a way to achieve energy-efficient decentralized inference.
Abstract
Large language models have significantly transformed multiple fields with their exceptional performance in natural language tasks, but their deployment in resource-constrained environments like edge networks presents an ongoing challenge. Decentralized techniques for inference have emerged, distributing the model blocks among multiple devices to improve flexibility and cost effectiveness. However, energy limitations remain a significant concern for edge devices. We propose a sustainable model for collaborative inference on interconnected, battery-powered edge devices with energy harvesting. A semi-Markov model is developed to describe the states of the devices, considering processing parameters and average green energy arrivals. This informs the design of scheduling algorithms that aim to minimize device downtimes and maximize network throughput. Through empirical evaluations and…
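The idea of scheduling around battery state and harvested energy can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's semi-Markov model or its scheduling algorithms: the `Device` fields, the slotted time model, the fixed per-slot harvest, and the "most energy left after serving" rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Device:
    """Hypothetical edge device; all fields are illustrative assumptions."""
    battery: float   # energy currently stored (J)
    capacity: float  # maximum battery energy (J)
    harvest: float   # average energy harvested per time slot (J)
    cost: float      # energy needed to serve one inference job (J)

    def can_serve(self) -> bool:
        # A device is "up" only if it can pay for a full job.
        return self.battery >= self.cost


def harvest_step(devices: List[Device]) -> None:
    """Add one slot of harvested energy, capped at battery capacity."""
    for d in devices:
        d.battery = min(d.capacity, d.battery + d.harvest)


def pick_energy_aware(devices: List[Device]) -> Optional[int]:
    """Return the index of the device with the most energy left after
    serving one job, or None if no device can serve (assumed rule)."""
    candidates = [i for i, d in enumerate(devices) if d.can_serve()]
    if not candidates:
        return None
    return max(candidates, key=lambda i: devices[i].battery - devices[i].cost)


def simulate(devices: List[Device], slots: int) -> int:
    """Offer one job per slot and return how many jobs were served."""
    served = 0
    for _ in range(slots):
        i = pick_energy_aware(devices)
        if i is not None:
            devices[i].battery -= devices[i].cost
            served += 1
        harvest_step(devices)
    return served
```

For example, `simulate([Device(5, 10, 1, 4), Device(2, 10, 2, 4)], slots=3)` offers three jobs to two devices; a slot in which neither battery covers the job cost counts as downtime, which is the quantity the paper's schedulers aim to minimize.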
Taxonomy
Topics: Advanced Memory and Neural Computing · Molecular Communication and Nanonetworks · Neural Networks and Reservoir Computing
