Sayer: Using Implicit Feedback to Optimize System Policies

Mathias Lécuyer; Sang Hoon Kim; Mihir Nanavati; Junchen Jiang; Siddhartha Sen; Amit Sharma; Aleksandrs Slivkins

arXiv:2110.14874 · cs.LG · October 29, 2021

TL;DR

Sayer is a methodology that uses implicit feedback and reinforcement learning techniques to evaluate and optimize system policies without deployment, improving decision-making in resource management.

Contribution

It introduces a novel approach combining implicit exploration and counterfactual estimators to leverage implicit feedback for policy evaluation and training.

Findings

01. Accurately evaluates policies in Azure scenarios

02. Outperforms existing production policies

03. Demonstrates unbiased policy assessment

Abstract

We observe that many system policies that make threshold decisions involving a resource (e.g., time, memory, cores) naturally reveal additional, or implicit, feedback. For example, if a system waits X min for an event to occur, then it automatically learns what would have happened if it waited <X min, because time has a cumulative property. This feedback tells us about alternative decisions, and can be used to improve the system policy. However, leveraging implicit feedback is difficult because it tends to be one-sided or incomplete, and may depend on the outcome of the event. As a result, existing practices for using feedback, such as simply incorporating it into a data-driven model, suffer from bias. We develop a methodology, called Sayer, that leverages implicit feedback to evaluate and train new system policies. Sayer builds on two ideas from reinforcement learning -- randomized…
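
The waiting example in the abstract can be made concrete with a minimal Python sketch. It is not Sayer's implementation; it only illustrates which alternative thresholds a single logged wait reveals. The function name `implicit_feedback`, its parameters, and the candidate thresholds are hypothetical, and the sketch assumes the event's arrival time does not depend on the threshold the system chose.

```python
from typing import Dict, List, Optional


def implicit_feedback(
    waited: float,
    event_time: Optional[float],
    candidates: List[float],
) -> Dict[float, Optional[bool]]:
    """Map each candidate threshold to the outcome revealed by one logged wait.

    True/False: the log reveals whether the event would have occurred in time.
    None: the log says nothing about this threshold.
    Assumption (not from the paper): event_time is independent of `waited`.
    """
    revealed: Dict[float, Optional[bool]] = {}
    for x in candidates:
        if event_time is not None:
            # The event was observed at event_time <= waited, so the outcome
            # of every threshold is known, shorter or longer.
            revealed[x] = event_time <= x
        elif x <= waited:
            # No event within `waited` minutes implies none within any
            # shorter wait (the cumulative property of time).
            revealed[x] = False
        else:
            # A longer wait might or might not have caught the event.
            revealed[x] = None
    return revealed


candidates = [1.0, 3.0, 5.0, 10.0]
# Event arrived after 2 min while the system waited 5 min: full feedback.
print(implicit_feedback(waited=5.0, event_time=2.0, candidates=candidates))
# No event within 5 min: only thresholds up to 5 min are resolved.
print(implicit_feedback(waited=5.0, event_time=None, candidates=candidates))
```

In the second call the feedback is one-sided: thresholds above the logged wait stay unknown precisely when the event did not occur, which is the outcome-dependent incompleteness the abstract says can bias a model trained naively on such data.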

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Machine Learning and Data Classification