Natural Language Specification of Reinforcement Learning Policies through Differentiable Decision Trees

Pradyumna Tambwekar; Andrew Silva; Nakul Gopalan; Matthew Gombolay

arXiv:2101.07140·cs.LG·May 23, 2023

TL;DR

This paper introduces a framework that lets humans specify initial robot behaviors in natural language; the specifications are converted into decision trees that warm-start reinforcement learning, improving accessibility and efficiency.

Contribution

The paper presents a new method to translate natural language into decision trees for initializing reinforcement learning policies, making autonomous systems more accessible to non-experts.

Findings

01

Achieved over 80% translation accuracy from natural language to decision trees.

02

Policies initialized with human language specifications match baseline RL performance.

03

Framework reduces domain exploration costs by leveraging natural language inputs.

Abstract

Human-AI policy specification is a novel procedure we define in which humans can collaboratively warm-start a robot's reinforcement learning policy. This procedure comprises two steps: (1) Policy Specification, i.e., humans specifying the behavior they would like their companion robot to accomplish, and (2) Policy Optimization, i.e., the robot applying reinforcement learning to improve the initial policy. Existing approaches to collaborative policy specification are often unintelligible black-box methods and are not designed to make the autonomous system accessible to a novice end user. In this paper, we develop a novel collaborative framework to allow humans to initialize and interpret an autonomous agent's behavior. Through our framework, we enable humans to specify an initial behavior model via unstructured, natural language (NL), which we convert to lexical…
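The two steps described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single soft split of the form sigmoid(alpha * (w · x - c)), a common way to write a differentiable decision node, and every name in it (`SoftDecisionNode`, the distance feature, the brake/accelerate actions, the rule itself) is hypothetical. The node is initialized from the hand-written rule "if the distance to the car ahead is below 0.3, brake; otherwise accelerate", standing in for a specification that has already been translated from natural language.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    z = z - np.max(z)          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


class SoftDecisionNode:
    """Depth-1 differentiable decision tree: one soft split, two leaves.

    Hypothetical illustration only; not the architecture from the paper.
    """

    def __init__(self, weights, threshold, true_logits, false_logits, alpha=10.0):
        self.w = np.asarray(weights, dtype=float)          # feature weights of the split
        self.c = float(threshold)                          # comparison value of the split
        self.true_logits = np.asarray(true_logits, dtype=float)    # leaf used when the condition holds
        self.false_logits = np.asarray(false_logits, dtype=float)  # leaf used otherwise
        self.alpha = alpha                                 # sharpness of the soft split

    def action_probs(self, state):
        state = np.asarray(state, dtype=float)
        # Probability that the condition (w . x > c) holds.
        p_true = sigmoid(self.alpha * (self.w @ state - self.c))
        # Mix the two leaf distributions by that probability.
        return p_true * softmax(self.true_logits) + (1.0 - p_true) * softmax(self.false_logits)


# Step 1 (specification): encode "distance < 0.3 -> brake, else accelerate".
# distance < 0.3 is the same as (-1 * distance) > -0.3, so w = [-1], c = -0.3.
# Action order (assumed): index 0 = brake, index 1 = accelerate.
policy = SoftDecisionNode(
    weights=[-1.0],
    threshold=-0.3,
    true_logits=[2.0, -2.0],
    false_logits=[-2.0, 2.0],
)

close_probs = policy.action_probs([0.1])  # car ahead is close
far_probs = policy.action_probs([0.9])    # car ahead is far
print("close:", close_probs, "far:", far_probs)
```

Because `w`, `c`, and the leaf logits are ordinary real-valued parameters and the output is a smooth function of them, a gradient-based reinforcement learning method could, in principle, refine them after this initialization (step 2). The sketch stops before that step.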

Code & Models

Repositories

eleurent/highway-env
noneOfficial

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Reinforcement Learning in Robotics