Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences

Lin Guan; Karthik Valmeekam; Subbarao Kambhampati

arXiv:2210.15906·cs.AI·March 1, 2023

TL;DR

This paper introduces Relative Behavioral Attributes, enabling users to efficiently modify AI agent behaviors through symbolic concepts, significantly reducing feedback requirements compared to traditional preference-based methods.

Contribution

The paper proposes a novel approach to model behavioral attributes from behavior clips, allowing intuitive behavior adjustments with minimal user feedback.

Findings

01. Effective modeling of behavioral attributes from clips

02. Reduced feedback needed for behavior customization

03. Successful application across multiple tasks and attributes

Abstract

Generating complex behaviors that satisfy the preferences of non-expert users is a crucial requirement for AI agents. Interactive reward learning from trajectory comparisons (a.k.a. RLHF) is one way to allow non-expert users to convey complex objectives by expressing preferences over short clips of agent behaviors. Even though this parametric method can encode complex tacit knowledge present in the underlying tasks, it implicitly assumes that the human is unable to provide richer feedback than binary preference labels, leading to intolerably high feedback complexity and poor user experience. While providing a detailed symbolic closed-form specification of the objectives might be tempting, it is not always feasible even for an expert user. However, in most cases, humans are aware of how the agent should change its behavior along meaningful axes to fulfill their underlying purpose, even…
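As an illustration of the reward learning from trajectory comparisons that the abstract describes, here is a minimal sketch of the Bradley-Terry preference model commonly used in that setting. It is an assumption that this standard formulation applies here, since this excerpt does not specify the paper's exact loss. The function names (`clip_return`, `preference_probability`, `preference_loss`) are hypothetical, and each clip is represented only by a list of per-step rewards predicted by some reward model.

```python
import math


def clip_return(rewards):
    """Sum the per-step predicted rewards of one behavior clip."""
    return sum(rewards)


def preference_probability(rewards_a, rewards_b):
    """Bradley-Terry probability that clip A is preferred over clip B.

    Assumed (standard) form: sigmoid of the difference in clip returns.
    """
    diff = clip_return(rewards_a) - clip_return(rewards_b)
    return 1.0 / (1.0 + math.exp(-diff))


def preference_loss(rewards_a, rewards_b, label):
    """Cross-entropy loss for one binary preference label.

    label is 1.0 if the human preferred clip A and 0.0 if they preferred clip B.
    """
    p = preference_probability(rewards_a, rewards_b)
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps))


if __name__ == "__main__":
    clip_a = [0.5, 1.0, 0.5]  # hypothetical per-step rewards for clip A
    clip_b = [0.0, 0.5, 0.5]  # hypothetical per-step rewards for clip B
    print(preference_probability(clip_a, clip_b))
    print(preference_loss(clip_a, clip_b, label=1.0))
```

Each human answer contributes only one binary label to this loss, which is the source of the high feedback complexity the abstract points to.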

Videos

Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences · SlidesLive

Taxonomy

Topics: Human Pose and Action Recognition · Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Attentive Walk-Aggregating Graph Neural Network