FAuNO: Semi-Asynchronous Federated Reinforcement Learning Framework for Task Offloading in Edge Systems
Frederico Metelo, Alexandre Oliveira, Stevo Racković, Pedro Ákos Costa, Cláudia Soares

TL;DR
FAuNO introduces a semi-asynchronous federated reinforcement learning framework for decentralized task offloading in edge systems, improving efficiency and reducing latency through a novel actor-critic architecture.
Contribution
It proposes a new federated RL framework with an actor-critic architecture for decentralized edge task offloading, enhancing cooperation and performance.
Findings
Outperforms heuristic and federated multi-agent RL baselines in reducing task loss.
Consistently matches or exceeds baseline performance in dynamic scenarios.
Demonstrates adaptability to edge computing environments.
Abstract
Edge computing addresses the growing data demands of connected-device networks by placing computational resources closer to end users through decentralized infrastructures. This decentralization challenges traditional, fully centralized orchestration, which suffers from latency and resource bottlenecks. We present FAuNO (Federated Asynchronous Network Orchestrator), a buffered, asynchronous federated reinforcement-learning (FRL) framework for decentralized task offloading in edge systems. FAuNO adopts an actor-critic architecture in which local actors learn node-specific dynamics and peer interactions, while a federated critic aggregates experience across agents to encourage efficient cooperation and improve overall system performance. Experiments in the PeersimGym environment show that FAuNO consistently matches or exceeds heuristic and federated…
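The buffered, semi-asynchronous aggregation of a federated critic described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the class name `BufferedCriticServer`, the buffer size, the server step size, and the `1/sqrt(1 + staleness)` discount are all assumptions chosen for illustration, in the spirit of FedBuff-style buffering. Actors are assumed to stay local and never reach the server.

```python
from dataclasses import dataclass, field
from typing import List

Vector = List[float]


@dataclass
class BufferedCriticServer:
    """Hypothetical server that federates only the critic parameters."""

    global_critic: Vector
    buffer_size: int = 2      # updates collected before aggregating (assumed)
    server_lr: float = 1.0    # server step size (assumed)
    version: int = 0          # incremented on every aggregation
    pending: List[Vector] = field(default_factory=list)

    def submit(self, delta: Vector, base_version: int) -> bool:
        """Store one agent's critic update; aggregate once the buffer is full.

        `delta` is the agent's local critic minus the global critic it
        started from. Returns True if this call triggered an aggregation.
        """
        # Updates computed against an older global critic are down-weighted.
        staleness = self.version - base_version
        weight = 1.0 / (1.0 + staleness) ** 0.5  # assumed staleness discount
        self.pending.append([weight * d for d in delta])

        # Agents are never blocked: the server simply waits for more updates.
        if len(self.pending) < self.buffer_size:
            return False

        # Average the buffered deltas and apply them to the global critic.
        n = len(self.pending)
        for i in range(len(self.global_critic)):
            mean_delta = sum(update[i] for update in self.pending) / n
            self.global_critic[i] += self.server_lr * mean_delta

        self.pending.clear()
        self.version += 1
        return True


server = BufferedCriticServer(global_critic=[0.0, 0.0], buffer_size=2)
first = server.submit([1.0, 2.0], base_version=0)   # buffered only
second = server.submit([3.0, 4.0], base_version=0)  # triggers aggregation
```

In this sketch a slow agent does not stall the others: its update is merged whenever it arrives, with a smaller weight if the global critic has moved on since the agent last synchronized.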
Peer Reviews
Decision: Submitted to ICLR 2026
1. This paper provides a detailed review in Background & Related Work. 2. This paper presents a comprehensive process for building the system model.
1. The description of the global component requires improvement. The relationship between Eq. 11 and Eq. 12 is unclear, and their connection to Figure 2 lacks detailed explanation. 2. The description in Fig. 2 is redundant and confusing, unable to convey the paper's idea. 3. The method proposed in this paper mostly relies on combining existing approaches (the PPO algorithm and FedBuff), which lacks innovation. 4. The paper has limited baselines for comparison, only one heuristic method and one FRL -ba
1. FedBuff's "buffered semi-asynchronous" concept is introduced into federated reinforcement learning, and only the critic is federated, taking into account both personalization and sample efficiency. The engineering implementation is complete and open source. 2. The article also performs communication-level event simulation on PeersimGym, which more closely resembles real-world edge links.
1. Insufficient theoretical analysis. The paper models task offloading as a POMG and proposes a framework of "actor local, critic federation, and semi-asynchronous buffer aggregation." However, it lacks convergence guarantees or upper bounds on error under semi-asynchronous and staleness conditions. It also fails to analyze the estimation bias of federated critics under non-IID or distribution drift conditions. 2. Ablation depth. Since the core claim hinges on critic-only federation + semi-async, include ablat
- The paper addresses an important practical problem (handling stragglers in federated edge RL) with a sensible approach combining buffered asynchronous aggregation and actor-critic MARL. - The presentation is generally clear, with helpful visualizations (Fig. 2) and a logical flow from problem formulation to experiments. - The experimental evaluation includes multiple network topologies (Ether-based and random), ablations on buffer size (Table 8) and packet drops, and a heterogeneity analysis (
While FAuNO is cleanly implemented and well-motivated, a few important weaknesses keep it from being fully convincing in its current form: 1. Mathematical inconsistencies and unclear formulations. Several of the paper's core equations need revision or clarification: * The communication delay formula (Eq. 1) mixes logarithms with dB values, which gives units of bits/Hz rather than seconds. This should be fixed using $T = \frac{\alpha}{B \log_2(1+\text{SNR}_{\text{linear}})}$. Using natural
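The unit issue raised in the review above can be checked with a short sketch of the suggested Shannon-style formula. This is only an illustration, not the paper's Eq. 1: it assumes that α is the payload size in bits, B is the bandwidth in Hz, and the SNR is given in dB and must be converted to a linear ratio before the logarithm is taken. The function names are hypothetical.

```python
import math


def db_to_linear(snr_db: float) -> float:
    """Convert a signal-to-noise ratio from decibels to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def transmission_delay_s(payload_bits: float, bandwidth_hz: float, snr_db: float) -> float:
    """Delay in seconds: T = alpha / (B * log2(1 + SNR_linear)).

    payload_bits plays the role of alpha (assumed to be in bits), so
    bits / (Hz * bits/s/Hz) comes out in seconds.
    """
    rate_bits_per_s = bandwidth_hz * math.log2(1.0 + db_to_linear(snr_db))
    return payload_bits / rate_bits_per_s


# At 0 dB the linear SNR is 1, so log2(1 + 1) = 1 and the rate equals B.
delay = transmission_delay_s(payload_bits=1e6, bandwidth_hz=1e6, snr_db=0.0)
```

Feeding the dB value directly into the logarithm would overstate or understate the rate depending on the operating point, which is why the conversion step is kept explicit here.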
Taxonomy
Topics: IoT and Edge/Fog Computing · Age of Information Optimization · Privacy-Preserving Technologies in Data
