Learning Transparent Reward Models via Unsupervised Feature Selection

Daulet Baimukashev; Gokhan Alcan; Kevin Sebastian Luck; Ville Kyrki

arXiv:2410.18608·cs.RO·May 5, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces a method to create simple, transparent reward models by automatically selecting key state features, enabling effective policy learning in complex robotic tasks.

Contribution

The paper presents a novel approach for constructing compact, interpretable reward functions through unsupervised feature selection, with the aim of improving policy learning from expert data.

Findings

01 Effective in robotic environments with high-dimensional states

02 Produces explicit, interpretable reward models

03 Enables training of policies that mimic expert behavior

Abstract

In complex real-world tasks such as robotic manipulation and autonomous driving, collecting expert demonstrations is often more straightforward than specifying precise learning objectives and task descriptions. Learning from expert data can be achieved through behavioral cloning or by learning a reward function, i.e., inverse reinforcement learning. The latter allows for training with additional data outside the training distribution, guided by the inferred reward function. We propose a novel approach to construct compact and transparent reward models from automatically selected state features. These inferred rewards have an explicit form and enable the learning of policies that closely match expert behavior by training standard reinforcement learning algorithms from scratch. We validate our method's performance in various robotic environments with continuous and high-dimensional state…
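To make the idea of an explicit reward built from a few selected state features concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the selection criterion (keep the state dimensions on which the expert data varies least) and the reward form (negative squared standardized distance to the expert mean on those dimensions) are assumptions chosen only for illustration, and the names `select_features` and `make_reward` are hypothetical.

```python
import numpy as np


def select_features(expert_states, k):
    """Return indices of the k state dimensions with the smallest spread.

    Assumed criterion (illustrative only, not taken from the paper):
    dimensions on which expert states are most consistent are treated
    as the task-relevant ones.
    """
    spread = expert_states.std(axis=0)
    return np.sort(np.argsort(spread)[:k])


def make_reward(expert_states, idx):
    """Build an explicit reward over the selected features.

    Assumed form: negative sum of squared standardized deviations from
    the expert mean, so the reward is highest (close to 0) when the
    selected features match typical expert values.
    """
    target = expert_states[:, idx].mean(axis=0)
    scale = expert_states[:, idx].std(axis=0) + 1e-8  # avoid division by zero

    def reward(state):
        z = (np.asarray(state, dtype=float)[idx] - target) / scale
        return -float(np.sum(z ** 2))

    return reward


# Synthetic "expert" data: 6-dimensional states where only dimensions
# 1 and 4 are tightly controlled by the expert.
rng = np.random.default_rng(0)
states = rng.normal(0.0, 1.0, size=(500, 6))
states[:, 1] = 0.5 + 0.01 * rng.normal(size=500)
states[:, 4] = -2.0 + 0.01 * rng.normal(size=500)

idx = select_features(states, k=2)
reward = make_reward(states, idx)
```

Because the reward has a closed form over a handful of named state dimensions, it can be inspected directly and passed as the reward signal to a standard reinforcement learning algorithm, which is the property the abstract emphasizes.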

Peer Reviews

Decision·CoRL 2024

Reviewer 01 · Rating 3 · Confidence 4

Reviewer 02 · Rating 3 · Confidence 4

Reviewer 03 · Rating 3 · Confidence 3

Taxonomy

Topics · Machine Learning in Healthcare