Leveraging human Domain Knowledge to model an empirical Reward function for a Reinforcement Learning problem

Dattaraj Rao

arXiv:1909.07116·cs.AI·September 17, 2019

Open Access

TL;DR

This paper introduces a novel method for creating RL environments by modeling reward functions from human domain knowledge, demonstrated through thermostat temperature control, reducing reliance on detailed physical models.

Contribution

It proposes an empirical approach to model reward functions from human rules, enabling RL training without exhaustive physical environment simulations.

Findings

01 Empirical reward modeling effectively guides RL training.

02 The approach simplifies environment creation for thermostat control.

03 Deep deterministic policy gradient (DDPG) successfully optimized temperature policies.

Abstract

Traditional Reinforcement Learning (RL) problems depend on an exhaustive simulation environment that models the real-world physics of the problem; the RL agent is trained by observing this environment. In this paper, we present a novel approach to creating an environment by modeling the reward function based on empirical rules extracted from human domain knowledge of the system under study. Using this empirical reward function, we will build an environment and train the agent. We will first create an environment that emulates the effect of setting cabin temperature through a thermostat. This is typically done in RL problems by creating an exhaustive model of the system with a detailed thermodynamic study. Instead, we propose an empirical approach to model the reward function based on human domain knowledge. We will document some rules of thumb that we usually exercise as humans while setting…
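To make the idea in the abstract concrete, here is a minimal sketch of what a rule-based reward and a toy thermostat environment might look like. It is not the paper's implementation: the abstract is truncated before the actual rules of thumb, so the comfort band, the penalty slope, the action range, and the names `empirical_reward` and `ThermostatEnv` are all hypothetical choices made for illustration. No DDPG agent is included; the sketch only shows how a reward derived from a human rule can stand in for a thermodynamic model.

```python
# Hypothetical comfort band in degrees Celsius; assumed, not taken from the paper.
COMFORT_LOW, COMFORT_HIGH = 20.0, 25.0


def empirical_reward(temperature: float) -> float:
    """Rule-of-thumb reward: best inside the comfort band, worse further away."""
    if COMFORT_LOW <= temperature <= COMFORT_HIGH:
        return 1.0
    # Penalize linearly by distance from the nearest edge of the band.
    # The slope of 0.1 per degree is an assumed value.
    distance = min(abs(temperature - COMFORT_LOW), abs(temperature - COMFORT_HIGH))
    return -0.1 * distance


class ThermostatEnv:
    """Toy environment: the state is the cabin temperature, the action changes it."""

    def __init__(self, start_temperature: float = 30.0, max_steps: int = 50):
        self.start_temperature = start_temperature
        self.max_steps = max_steps
        self.reset()

    def reset(self) -> float:
        """Return the environment to its starting temperature."""
        self.temperature = self.start_temperature
        self.steps = 0
        return self.temperature

    def step(self, action: float):
        """Apply a continuous action and return (state, reward, done)."""
        # Clip the action to an assumed range of +/- 1 degree per step.
        action = max(-1.0, min(1.0, action))
        self.temperature += action
        self.steps += 1
        reward = empirical_reward(self.temperature)
        done = self.steps >= self.max_steps
        return self.temperature, reward, done


if __name__ == "__main__":
    env = ThermostatEnv()
    state = env.reset()
    done = False
    total = 0.0
    while not done:
        # Hand-written placeholder policy: cool down while above the band.
        action = -1.0 if state > COMFORT_HIGH else 0.0
        state, reward, done = env.step(action)
        total += reward
    print(f"final temperature: {state:.1f}, total reward: {total:.2f}")
```

The hand-written policy in the usage loop is only a placeholder for a trained agent. In the paper's setting, a continuous-action learner such as DDPG would take its place and learn the policy from the reward signal alone.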

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Energy Efficiency and Management