Trust-Aware Routing for Distributed Generative AI Inference at the Edge
Chanh Nguyen, Erik Elmroth

TL;DR
G-TRAC is a trust-aware routing framework for distributed generative AI inference at the edge, improving robustness and reliability in decentralized environments.
Contribution
G-TRAC introduces a polynomial-time routing algorithm and a hybrid trust architecture tailored for dynamic, heterogeneous edge networks.
Findings
Significantly improves inference completion rates.
Effectively isolates unreliable peers.
Maintains robustness under node failures and network partitions.
Abstract
Emerging deployments of Generative AI increasingly execute inference across decentralized and heterogeneous edge devices rather than on a single trusted server. In such environments, a single device failure or misbehavior can disrupt the entire inference process, making traditional best-effort peer-to-peer routing insufficient. Coordinating distributed generative inference therefore requires mechanisms that explicitly account for reliability, performance variability, and trust among participating peers. In this paper, we present G-TRAC, a trust-aware coordination framework that integrates algorithmic path selection with system-level protocol design to ensure robust distributed inference. First, we formulate the routing problem as a Risk-Bounded Shortest Path computation and introduce a polynomial-time solution that combines trust-floor pruning with Dijkstra's search,…
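The combination of trust-floor pruning with Dijkstra's search can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `trust_pruned_shortest_path`, the per-peer trust scores in [0, 1], the `trust_floor` threshold, and the edge costs (standing in for latency) are all assumptions made for illustration. The idea shown is simply that peers below the floor are removed from consideration first, and an ordinary shortest-path search then runs over what remains, which keeps the whole procedure polynomial-time.

```python
import heapq


def trust_pruned_shortest_path(graph, trust, source, target, trust_floor):
    """Return (cost, path) over peers meeting the trust floor, or None.

    graph: {node: {neighbor: non-negative edge cost}} (hypothetical layout)
    trust: {node: trust score in [0, 1]} (hypothetical scores)
    """
    # Trust-floor pruning: drop every peer whose score is below the floor.
    allowed = {n for n in graph if trust.get(n, 0.0) >= trust_floor}
    if source not in allowed or target not in allowed:
        return None

    # Standard Dijkstra search restricted to the surviving peers.
    dist = {source: 0.0}
    prev = {}
    visited = set()
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for nbr, cost in graph[node].items():
            if nbr not in allowed:
                continue
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))

    if target not in dist:
        return None

    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Illustrative data only: four peers, with B cheap to reach but poorly trusted.
graph = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"D": 1.0},
    "C": {"D": 1.0},
    "D": {},
}
trust = {"A": 0.9, "B": 0.3, "C": 0.8, "D": 0.95}

strict = trust_pruned_shortest_path(graph, trust, "A", "D", trust_floor=0.5)
relaxed = trust_pruned_shortest_path(graph, trust, "A", "D", trust_floor=0.0)
```

With a floor of 0.5 the low-trust peer B is excluded and the search falls back to the costlier route through C; with a floor of 0.0 the cheaper route through B is chosen. The sketch treats trust as a hard per-node filter only, which may be simpler than the risk bound the paper formulates.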
