Edge-Compatible Reinforcement Learning for Recommendations
James E. Kostas, Philip S. Thomas, Georgios Theocharous

TL;DR
This paper introduces a distributed, asynchronous reinforcement learning algorithm tailored for edge recommendation systems, capable of functioning effectively despite network delays and failures, with strong theoretical grounding and practical validation.
Contribution
It presents a novel, principled RL algorithm based on asynchronous coagent policy gradients, specifically designed for real-time, distributed edge environments.
Findings
The algorithm performs well even when network quality is degraded.
The approach is theoretically grounded and suitable for asynchronous, distributed deployment.
The algorithm demonstrates robustness and practical effectiveness in edge recommendation scenarios.
Abstract
Most reinforcement learning (RL) recommendation systems designed for edge computing must either synchronize during recommendation selection or depend on an unprincipled patchwork collection of algorithms. In this work, we build on asynchronous coagent policy gradient algorithms (Kostas et al., 2020) to propose a principled solution to this problem. The class of algorithms that we propose can be distributed over the internet and run asynchronously and in real time. When a given edge fails to respond to a request for data with sufficient speed, this is not a problem; the algorithm is designed to function and learn in the edge setting, and network issues are part of this setting. The result is a principled, theoretically grounded RL algorithm designed to be distributed in and learn in this asynchronous environment. In this work, we describe this algorithm and a proposed class of…
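The abstract's point that a slow edge "is not a problem" can be illustrated with a minimal sketch of deadline-tolerant querying. This is not the paper's algorithm and contains no coagent policy gradient update; it only shows one way a caller could proceed without waiting on a late edge. The names `query_edge` and `collect_edge_responses`, the simulated delays, and the `fallback` value are all hypothetical and chosen purely for illustration.

```python
import asyncio


async def query_edge(edge_id: int, delay: float) -> float:
    """Hypothetical edge node: replies with a dummy value after `delay` seconds."""
    await asyncio.sleep(delay)  # stands in for network and compute latency
    return float(edge_id)       # dummy payload; a real edge would return data


async def _collect(delays, timeout, fallback):
    async def ask(edge_id, delay):
        try:
            # Give each edge at most `timeout` seconds to answer.
            return await asyncio.wait_for(query_edge(edge_id, delay), timeout)
        except asyncio.TimeoutError:
            # Assumption: a late edge is replaced by a fixed fallback value
            # so that recommendation selection never blocks on it.
            return fallback

    # Query all edges concurrently; results keep the order of `delays`.
    return await asyncio.gather(*(ask(i, d) for i, d in enumerate(delays)))


def collect_edge_responses(delays, timeout=0.2, fallback=-1.0):
    """Return one value per edge, using `fallback` for edges that miss the deadline."""
    return asyncio.run(_collect(delays, timeout, fallback))


if __name__ == "__main__":
    # Edge 1 is simulated as far too slow, so its slot holds the fallback.
    print(collect_edge_responses([0.0, 5.0, 0.0]))
```

In this sketch the slow edge's request is cancelled at the deadline, so the total wait is bounded by `timeout` rather than by the slowest edge. How the actual method treats missing responses during learning is specified in the paper, not here.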
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Recommender Systems and Techniques · Advanced Bandit Algorithms Research
