Towards customizable reinforcement learning agents: Enabling preference specification through online vocabulary expansion

Utkarsh Soni; Nupur Thakur; Sarath Sreedharan; Lin Guan; Mudit Verma; Matthew Marquez; Subbarao Kambhampati

arXiv:2210.15096·cs.AI·February 2, 2023

Open Access

TL;DR

This paper introduces PRESCA, a system that lets users specify preferences for reinforcement learning agents in terms of concepts from a shared vocabulary. PRESCA learns new concepts efficiently using causal associations and data augmentation, and is demonstrated in Minecraft.

Contribution

PRESCA lets users specify preferences through concepts, learns new concepts efficiently by combining causal and data augmentation techniques, and thereby makes reinforcement learning agents easier for users to customize.

Findings

01  Effective preference alignment in the Minecraft environment

02  Reduces the feedback needed for concept learning

03  Improves agent behavior according to user preferences

Abstract

There is growing interest in developing automated agents that can work alongside humans. In addition to completing the assigned task, such an agent will undoubtedly be expected to behave in a manner that the human prefers. This requires the human to communicate their preferences to the agent. To achieve this, current approaches either require users to specify the reward function or learn the preference interactively from queries that ask the user to compare behaviors. The former approach can be challenging if the internal representation used by the agent is inscrutable to the human, while the latter is unnecessarily cumbersome for the user if their preference can be specified more easily in symbolic terms. In this work, we propose PRESCA (PREference Specification through Concept Acquisition), a system that allows users to specify their preferences in terms of…

Taxonomy

Topics: Data Stream Mining Techniques · Fuzzy Logic and Control Systems · AI-based Problem Solving and Planning

Methods: ALIGN