Performance Optimization of Ratings-Based Reinforcement Learning

Evelyn Rose; Devin White; Mingkang Wu; Vernon Lawhern; Nicholas R. Waytowich; Yongcan Cao

arXiv:2501.07755·cs.LG·January 15, 2025

TL;DR

This paper investigates optimization techniques to enhance rating-based reinforcement learning (RbRL), focusing on hyperparameter tuning and providing guidelines for better performance in reward inference tasks.

Contribution

It offers new insights into hyperparameter effects on RbRL and proposes practical guidelines for optimizing its performance.

Findings

01 Hyperparameters significantly influence RbRL effectiveness.

02 Guidelines help in selecting optimal hyperparameters for RbRL.

03 Enhanced understanding of RbRL's sensitivity to various factors.

Abstract

This paper explores multiple optimization methods to improve the performance of rating-based reinforcement learning (RbRL). RbRL, a method based on human ratings, was developed to infer reward functions in reward-free environments so that a policy can subsequently be learned via standard reinforcement learning, which requires a reward function. Specifically, RbRL minimizes the cross-entropy loss that quantifies the difference between human ratings and the estimated ratings derived from the inferred reward. Hence, a low loss means a high degree of consistency between human ratings and estimated ratings. Despite its simple form, RbRL has various hyperparameters and can be sensitive to various factors. Therefore, it is critical to provide comprehensive experiments to understand the impact of these hyperparameters on the performance of RbRL. This paper is a work in…
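To make the cross-entropy objective described in the abstract concrete, the following is a minimal sketch in Python with NumPy. The abstract does not specify how estimated ratings are derived from the inferred reward, so everything here is an assumption chosen for illustration: the function names `rating_probabilities` and `rating_cross_entropy`, the batch normalization of predicted returns to [0, 1], the evenly spaced class `boundaries`, and the sharpness hyperparameter `k` are all hypothetical and may differ from what the authors use.

```python
import numpy as np


def rating_probabilities(returns, boundaries, k=10.0):
    """Map predicted segment returns to probabilities over rating classes.

    Assumed form (not taken from the abstract): returns are min-max
    normalized over the batch, and class i gets a score that is positive
    when the normalized return lies between boundaries[i] and
    boundaries[i + 1], and negative otherwise.
    """
    returns = np.asarray(returns, dtype=float)
    boundaries = np.asarray(boundaries, dtype=float)
    span = returns.max() - returns.min()
    normed = (returns - returns.min()) / (span + 1e-8)
    lower, upper = boundaries[:-1], boundaries[1:]
    # Broadcast to shape (n_segments, n_classes).
    diff_lower = normed[:, None] - lower[None, :]
    diff_upper = normed[:, None] - upper[None, :]
    scores = -k * diff_lower * diff_upper
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)


def rating_cross_entropy(returns, human_ratings, boundaries, k=10.0):
    """Mean cross-entropy between human rating labels and estimated ratings."""
    probs = rating_probabilities(returns, boundaries, k)
    human_ratings = np.asarray(human_ratings, dtype=int)
    picked = probs[np.arange(len(human_ratings)), human_ratings]
    return float(-np.mean(np.log(picked + 1e-12)))


# Hypothetical data: three trajectory segments and three rating classes
# (0 = bad, 1 = neutral, 2 = good).
example_returns = [0.1, 2.0, 5.0]
example_boundaries = np.linspace(0.0, 1.0, 4)

# Ratings that agree with the ordering of the predicted returns.
consistent_loss = rating_cross_entropy(example_returns, [0, 1, 2], example_boundaries)
# Ratings that contradict the ordering of the predicted returns.
inconsistent_loss = rating_cross_entropy(example_returns, [2, 1, 0], example_boundaries)
```

In this toy setup the loss is lower when the human ratings agree with the ordering of the predicted returns, which is the sense in which a low loss indicates consistency between human and estimated ratings. Quantities such as `k`, the number of rating classes, and the boundary placement are the kind of hyperparameters whose effect a sensitivity study would examine, although the specific choices above are illustrative only.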

Taxonomy

Topics: Elevator Systems and Control