Active Inference-Based Adaptive Routing for Heterogeneous Edge AI Services
Zihang Wang, Boris Sedlak, Schahram Dustdar

TL;DR
This paper introduces AIF-Router, an active inference-based framework for adaptive routing of AI services in edge computing, enabling autonomous, real-time decision-making amidst infrastructure variability.
Contribution
It presents a novel active inference approach for online, self-adaptive routing in heterogeneous edge AI environments without requiring offline training.
Findings
AIF-Router learns online to balance latency, throughput, and resource utilization across multi-tier AI services in dynamic edge settings.
The framework demonstrates stable online learning despite device instability.
Applying active inference enables autonomous, adaptive service orchestration in unreliable edge environments.
Abstract
Edge computing enables AI inference closer to data sources, reducing latency and bandwidth costs. However, orchestrating AI services across the cloud-edge continuum remains challenging due to dynamic workloads and infrastructure variability. We present AIF-Router, an Active Inference-based routing framework that autonomously learns to balance latency, throughput, and resource utilization across multi-tier AI services without offline training. AIF-Router performs Bayesian state inference and expected free energy minimization to guide routing decisions based on observability-driven real-time metrics. Despite device instability on edge nodes, AIF-Router exhibits stable online learning behavior and demonstrates the feasibility of applying Active Inference for adaptive AI service orchestration in unreliable edge environments. Our findings highlight both the promise and practical challenges…
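To make the two mechanisms named in the abstract concrete, here is a minimal sketch of Bayesian state inference followed by expected free energy minimization on a toy routing problem. It is not the paper's model: the hidden states, routing targets, observation categories, and every probability below are hypothetical values chosen only for illustration, and the expected free energy uses the common risk-plus-ambiguity decomposition for discrete models.

```python
import math

# All names and numbers below are illustrative assumptions, not taken from the paper.
STATES = ["healthy", "degraded"]   # hypothetical hidden state of the edge tier
ACTIONS = ["edge", "cloud"]        # hypothetical routing targets
OBS = ["fast", "slow"]             # hypothetical discretized latency metric

# A[action][o][s] = P(observation o | state s, action).
# Routing to the cloud yields latency that does not depend on the edge state.
A = {
    "edge":  [[0.9, 0.2], [0.1, 0.8]],
    "cloud": [[0.6, 0.6], [0.4, 0.4]],
}
# B[s_next][s] = P(s_next | s); assumed independent of the routing action.
B = [[0.9, 0.1], [0.1, 0.9]]
# C[o] = preferred distribution over observations (fast responses are preferred).
C = [0.9, 0.1]


def normalize(p):
    total = sum(p)
    return [x / total for x in p]


def predict_state(belief):
    """One-step state prediction: q(s') = sum_s B[s'][s] * q(s)."""
    n = len(belief)
    return [sum(B[j][i] * belief[i] for i in range(n)) for j in range(n)]


def update_belief(prior, action, obs_index):
    """Bayesian state inference: posterior is proportional to likelihood * prior."""
    likelihood = A[action][obs_index]
    return normalize([likelihood[s] * prior[s] for s in range(len(prior))])


def expected_free_energy(belief, action):
    """G(action) = risk (KL from preferred outcomes) + ambiguity (expected entropy)."""
    q_s = predict_state(belief)
    n_s, n_o = len(q_s), len(OBS)
    q_o = [sum(A[action][o][s] * q_s[s] for s in range(n_s)) for o in range(n_o)]
    risk = sum(q * math.log(q / c) for q, c in zip(q_o, C) if q > 0)
    ambiguity = -sum(
        q_s[s] * sum(A[action][o][s] * math.log(A[action][o][s]) for o in range(n_o))
        for s in range(n_s)
    )
    return risk + ambiguity


def choose_route(belief):
    """Pick the routing target with the lowest expected free energy."""
    return min(ACTIONS, key=lambda a: expected_free_energy(belief, a))


if __name__ == "__main__":
    belief = [0.5, 0.5]
    belief = update_belief(belief, "edge", OBS.index("slow"))
    print("posterior:", belief)
    print("next route:", choose_route(belief))
```

In this toy setup, a slow response observed after routing to the edge shifts the belief toward the degraded state, and the router then prefers the cloud. A slow response observed after routing to the cloud leaves the belief unchanged, because under the assumed likelihood that observation carries no information about the edge tier.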
