Deciding What to Learn: A Rate-Distortion Approach

Dilip Arumugam; Benjamin Van Roy

arXiv:2101.06197 · cs.LG · June 23, 2021 · 1 citation


TL;DR

This paper introduces a rate-distortion-based framework in which an agent computes its own learning targets, trading off the information it must acquire against the sub-optimality of the resulting policy according to a single preference hyperparameter set by the designer.

Contribution

It presents a rate-distortion approach that lets agents determine their own learning targets from designer preferences, so the designer no longer has to translate those preferences into a fixed learning target.

Findings

01. Established a bound on expected discounted regret.

02. Demonstrated the method's ability to express designer preferences.

03. Showed improvements over Thompson sampling in experiments.

Abstract

Agents that learn to select optimal actions represent a prominent focus of the sequential decision-making literature. In the face of a complex environment or constraints on time and resources, however, aiming to synthesize such an optimal policy can become infeasible. These scenarios give rise to an important trade-off between the information an agent must acquire to learn and the sub-optimality of the resulting policy. While an agent designer has a preference for how this trade-off is resolved, existing approaches further require that the designer translate these preferences into a fixed learning target for the agent. In this work, leveraging rate-distortion theory, we automate this process such that the designer need only express their preferences via a single hyperparameter and the agent is endowed with the ability to compute its own learning targets that best achieve the desired…
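
The trade-off described in the abstract can be illustrated with a small numerical sketch. The excerpt above does not spell out how a learning target is computed, so the code below is only one plausible reading, under stated assumptions: a finite set of candidate environments with a known prior, per-arm regret used as the distortion, and the classical Blahut-Arimoto iteration used to minimize I(E; A) + beta * E[d(E, A)]. The names `blahut_arimoto`, `rate_and_distortion`, `sweep`, `MEANS`, `PRIOR` and `beta` are hypothetical and chosen for illustration; `beta` stands in for the single preference hyperparameter.

```python
import numpy as np

# Hypothetical toy problem: 3 candidate environments (rows), 3 arms (columns).
# Arm 1 is never far from optimal, so it is a natural "good enough" target.
MEANS = np.array([[0.9, 0.8, 0.1],
                  [0.1, 0.8, 0.9],
                  [0.5, 0.8, 0.5]])
PRIOR = np.full(3, 1.0 / 3.0)                      # assumed uniform prior over environments
REGRET = MEANS.max(axis=1, keepdims=True) - MEANS  # distortion d(e, a), assumed to be regret


def blahut_arimoto(prior, distortion, beta, iters=500):
    """Alternating updates that minimize I(E; A) + beta * E[d(E, A)]."""
    n_act = distortion.shape[1]
    marginal = np.full(n_act, 1.0 / n_act)         # q(a), initialized uniform
    for _ in range(iters):
        # p(a | e) is proportional to q(a) * exp(-beta * d(e, a)), computed in log space.
        logits = np.log(np.clip(marginal, 1e-300, None))[None, :] - beta * distortion
        logits -= logits.max(axis=1, keepdims=True)
        channel = np.exp(logits)
        channel /= channel.sum(axis=1, keepdims=True)
        marginal = prior @ channel                 # q(a) = sum_e p(e) p(a | e)
    return channel, marginal


def rate_and_distortion(prior, channel, marginal, distortion):
    """Return (mutual information in nats, expected distortion)."""
    joint = prior[:, None] * channel
    mask = joint > 0                               # skip zero-probability pairs
    ratio = channel[mask] / np.broadcast_to(marginal, channel.shape)[mask]
    rate = float(np.sum(joint[mask] * np.log(ratio)))
    return rate, float(np.sum(joint * distortion))


def sweep(betas):
    """Compute (beta, rate, distortion) for each preference setting."""
    results = []
    for beta in betas:
        channel, marginal = blahut_arimoto(PRIOR, REGRET, beta)
        results.append((beta, *rate_and_distortion(PRIOR, channel, marginal, REGRET)))
    return results


for beta, rate, dist in sweep([0.0, 5.0, 200.0]):
    print(f"beta={beta:6.1f}  rate={rate:.4f} nats  expected regret={dist:.4f}")
```

In this toy setting, a small `beta` yields a target that needs essentially no information about the environment, while a large `beta` pushes the target toward the optimal arm of each environment at a higher information cost. Values in between express intermediate preferences.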



Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms