Cache-Aware Reinforcement Learning in Large-Scale Recommender Systems

Xiaoshuang Chen; Gengrui Zhang; Yao Wang; Yulin Wu; Shuo Su; Kaiqiao Zhan; Ben Wang

arXiv:2404.14961·cs.LG·April 9, 2025

TL;DR

This paper introduces a cache-aware reinforcement learning method for large-scale recommender systems that optimizes recommendations by balancing real-time computation and cache usage, significantly improving user engagement.

Contribution

It proposes a novel reinforcement learning framework that jointly optimizes real-time and cached recommendations, addressing cache-induced critic dependency issues.

Findings

01. CARL improves user engagement in large-scale systems.

02. It is deployed in the Kwai app, serving over 100 million users.

03. The eigenfunction learning method effectively mitigates critic dependency.

Abstract

Modern large-scale recommender systems are built upon computation-intensive infrastructure and usually suffer from a huge difference in traffic between peak and off-peak periods. In peak periods, it is challenging to perform real-time computation for each request due to the limited budget of computational resources. The recommendation with a cache is a solution to this problem, where a user-wise result cache is used to provide recommendations when the recommender system cannot afford a real-time computation. However, the cached recommendations are usually suboptimal compared to real-time computation, and it is challenging to determine the items in the cache for each user. In this paper, we provide a cache-aware reinforcement learning (CARL) method to jointly optimize the recommendation by real-time computation and by the cache. We formulate the problem as a Markov decision process with…
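The serving pattern the abstract describes, real-time computation while budget remains and a user-wise result cache otherwise, can be illustrated with a minimal sketch. This is not the CARL method itself and contains no reinforcement learning. The class `CachedRecommender`, the `rank_fn` stand-in model, the per-period `budget` counter, and the choice to cache the unshown remainder of a real-time ranking are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CachedRecommender:
    """Toy serving loop with a user-wise result cache (all names hypothetical)."""

    rank_fn: Callable[[str], List[str]]  # stand-in for the real-time model
    budget: int                          # real-time computations allowed per period
    serve_size: int = 2                  # items shown per request
    cache: Dict[str, List[str]] = field(default_factory=dict)
    used: int = 0                        # real-time computations spent so far

    def recommend(self, user_id: str) -> List[str]:
        if self.used < self.budget:
            # Budget available: rank in real time, show the top items, and
            # keep the remainder as this user's cache (an assumed policy).
            self.used += 1
            ranked = self.rank_fn(user_id)
            self.cache[user_id] = ranked[self.serve_size:]
            return ranked[: self.serve_size]
        # Budget exhausted: fall back to the user's cache, which may be
        # stale, partially consumed, or empty.
        cached = self.cache.get(user_id, [])
        self.cache[user_id] = cached[self.serve_size:]
        return cached[: self.serve_size]


def toy_rank(user_id: str) -> List[str]:
    """Hypothetical ranking model returning six items, best first."""
    return [f"{user_id}-item{i}" for i in range(6)]


rec = CachedRecommender(rank_fn=toy_rank, budget=1)
first = rec.recommend("u1")   # served by real-time computation
second = rec.recommend("u1")  # budget spent, served from the cache
third = rec.recommend("u2")   # budget spent and no cache for this user
```

In this toy setup the second request for the same user receives lower-ranked items than the first, and a user with no cache receives nothing. That is the sense in which cached recommendations are suboptimal, and why the choice of items placed in each user's cache matters.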

Taxonomy

Topics: Recommender Systems and Techniques · Caching and Content Delivery · Advanced Bandit Algorithms Research