Edge-Compatible Reinforcement Learning for Recommendations

James E. Kostas; Philip S. Thomas; Georgios Theocharous

arXiv:2112.05812 · cs.LG · August 11, 2022

TL;DR

This paper introduces a distributed, asynchronous reinforcement learning algorithm tailored for edge recommendation systems, capable of functioning effectively despite network delays and failures, with strong theoretical grounding and practical validation.

Contribution

The paper presents a novel, principled RL algorithm, based on asynchronous coagent policy gradients, that is designed specifically for real-time, distributed edge environments.

Findings

01. The algorithm performs well even when network quality is degraded.

02. The approach is theoretically grounded and suitable for asynchronous, distributed deployment.

03. The algorithm is robust and practically effective in edge recommendation scenarios.

Abstract

Most reinforcement learning (RL) recommendation systems designed for edge computing must either synchronize during recommendation selection or depend on an unprincipled patchwork collection of algorithms. In this work, we build on asynchronous coagent policy gradient algorithms (Kostas et al., 2020) to propose a principled solution to this problem. The class of algorithms that we propose can be distributed over the internet and run asynchronously and in real-time. When a given edge fails to respond to a request for data with sufficient speed, this is not a problem; the algorithm is designed to function and learn in the edge setting, and network issues are part of this setting. The result is a principled, theoretically grounded RL algorithm designed to be distributed in and learn in this asynchronous environment. In this work, we describe this algorithm and a proposed class of…
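The setting the abstract describes, where a slow or unresponsive edge must not block recommendation selection, can be illustrated with a small sketch. This is not the paper's algorithm: it only shows the non-blocking request pattern, and every name in it (`query_edge`, `gather_with_deadline`, the `fallback` value, the simulated delays) is a hypothetical chosen for illustration.

```python
import asyncio


async def query_edge(delay, value):
    """Hypothetical edge node: replies with `value` after `delay` seconds."""
    await asyncio.sleep(delay)
    return value


async def gather_with_deadline(edges, timeout, fallback=None):
    """Query all edges concurrently; use `fallback` for any edge that is late.

    `edges` is a list of (name, simulated_delay, value) tuples (an assumption
    of this sketch, standing in for real network calls).
    """

    async def one(name, delay, value):
        try:
            reply = await asyncio.wait_for(query_edge(delay, value), timeout)
        except asyncio.TimeoutError:
            # A late edge is treated as a normal event, not an error.
            reply = fallback
        return name, reply

    pairs = await asyncio.gather(*(one(*edge) for edge in edges))
    return dict(pairs)


def run_demo():
    # One responsive edge and one edge slower than the deadline.
    edges = [("fast", 0.0, 1.0), ("slow", 0.5, 2.0)]
    return asyncio.run(gather_with_deadline(edges, timeout=0.1))


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the slow edge's reply is simply replaced by the fallback, so the caller always returns by the deadline. How the learning update should account for missing replies is the question the paper addresses, and it is not modeled here.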

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Recommender Systems and Techniques · Advanced Bandit Algorithms Research