Distributed Mixture-of-Agents for Edge Inference with Large Language Models
Purbesh Mitra, Priyanka Kaswan, Sennur Ulukus

TL;DR
This paper introduces a distributed mixture-of-agents architecture for edge devices running large language models, enabling collaborative inference via decentralized gossip algorithms, with theoretical queue stability analysis and improved response quality demonstrated experimentally.
Contribution
It proposes a novel distributed MoA framework for edge LLMs, including queue stability analysis and empirical validation of response quality improvements.
Findings
Device prompt queues are shown theoretically to remain bounded when certain stability conditions hold.
Distributed MoA improves response quality on AlpacaEval 2.0.
Theoretical stability conditions are validated experimentally.
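As a rough illustration of the queue-stability finding above, the following Python sketch simulates a single device's prompt queue in discrete time. It is not the paper's model: the Bernoulli arrival and service processes, the function name `simulate_queue`, and the parameters `arrival_prob` and `service_prob` are assumptions made for this example. It shows only the generic intuition that a queue stays bounded when prompts arrive, on average, more slowly than they are served.

```python
import random


def simulate_queue(arrival_prob, service_prob, steps=10_000, seed=0):
    """Simulate one device's prompt queue (hypothetical toy model).

    In each time slot, a prompt arrives with probability `arrival_prob`;
    then, if the queue is non-empty, one prompt is served (an inference
    completes) with probability `service_prob`.
    Returns (final_queue_length, max_queue_length).
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    queue_len = 0
    max_len = 0
    for _ in range(steps):
        # Assumed arrival process: at most one new prompt per slot.
        if rng.random() < arrival_prob:
            queue_len += 1
        # Assumed service process: at most one completed inference per slot.
        if queue_len > 0 and rng.random() < service_prob:
            queue_len -= 1
        max_len = max(max_len, queue_len)
    return queue_len, max_len


# Arrivals slower than service: the queue is expected to stay short.
stable_final, stable_max = simulate_queue(arrival_prob=0.3, service_prob=0.6)

# Arrivals faster than service: the queue is expected to keep growing.
unstable_final, unstable_max = simulate_queue(arrival_prob=0.6, service_prob=0.3)

print("stable:", stable_final, stable_max)
print("unstable:", unstable_final, unstable_max)
```

With these illustrative rates, the first queue stays short while the second grows roughly linearly with the number of time slots. The paper's actual stability conditions are not reproduced here.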
Abstract
Mixture-of-Agents (MoA) has recently been proposed as a method to enhance the performance of large language models (LLMs), enabling multiple individual LLMs to work together for collaborative inference. This collaborative approach results in improved responses to user prompts compared to relying on a single LLM. In this paper, we consider such an MoA architecture in a distributed setting, where LLMs operate on individual edge devices, each uniquely associated with a user and equipped with its own distributed computing power. These devices exchange information using decentralized gossip algorithms, allowing different device nodes to communicate without the supervision of a centralized server. In the considered setup, different users have their own LLMs to address user prompts. Additionally, the devices gossip either their own user-specific prompts or augmented prompts to generate more refined…
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Opinion Dynamics and Social Influence
