Addressing the issue of stochastic environments and local decision-making in multi-objective reinforcement learning

Kewen Ding

arXiv:2211.08669·cs.LG·November 17, 2022

Open Access

TL;DR

This paper investigates how stochastic environments and local decision-making affect the ability of multi-objective reinforcement learning algorithms to learn optimal policies, highlighting the impact of noisy Q-value estimates and proposing improvements.

Contribution

It identifies the influence of environment stochasticity and utility functions on MORL Q-learning performance and evaluates variants incorporating global statistics and option learning.

Findings

01

Reward signal design improves baseline performance

02

Global statistics variant outperforms baseline but is not fully effective

03

Option learning guarantees convergence but lacks scalability

Abstract

Multi-objective reinforcement learning (MORL) is a relatively new field which builds on conventional Reinforcement Learning (RL) to solve multi-objective problems. One common algorithm extends scalar-valued Q-learning by using vector Q-values in combination with a utility function, which captures the user's preferences, for action selection. This study builds on prior work and focuses on the factors that influence how frequently value-based MORL Q-learning algorithms learn the optimal policy for an environment with stochastic state transitions, in scenarios where the goal is to maximise the Scalarised Expected Return (SER), that is, to maximise the average outcome over multiple runs rather than the outcome within each individual episode. The analysis of the interaction between stochastic environment and MORL Q-learning algorithms run on a simple Multi-objective Markov…
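As a rough illustration of the setup the abstract describes, the sketch below stores one Q-value vector per state-action pair, selects actions by applying a utility function to those vectors, and contrasts the SER criterion with the average of per-episode utilities. Everything here is an assumption made for illustration and is not taken from the paper: the min-based `utility`, the function names, the array layout, and the toy numbers.

```python
import numpy as np

def utility(v):
    """Hypothetical nonlinear utility: the worse of the objectives (an assumption)."""
    return float(np.min(v))

def greedy_action(Q, state):
    """Pick the action whose vector Q-value scores highest under the utility."""
    scores = [utility(Q[state, a]) for a in range(Q.shape[1])]
    return int(np.argmax(scores))

def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=1.0):
    """One vector-valued Q-learning step; `reward` has one entry per objective."""
    a_next = greedy_action(Q, s_next)
    target = np.asarray(reward, dtype=float) + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

def ser(returns):
    """Scalarised Expected Return: utility of the mean return over many runs."""
    return utility(np.mean(returns, axis=0))

def mean_episode_utility(returns):
    """Average of the utility obtained within each individual episode."""
    return float(np.mean([utility(r) for r in returns]))

# Q has shape (n_states, n_actions, n_objectives).
Q = np.zeros((2, 2, 2))
Q[0, 0] = [4.0, 0.0]   # strong on objective 0 only
Q[0, 1] = [2.0, 2.0]   # balanced across both objectives
best = greedy_action(Q, 0)

# Two equally likely episode outcomes from a stochastic environment.
episode_returns = np.array([[4.0, 0.0], [0.0, 4.0]])
ser_value = ser(episode_returns)
per_episode_value = mean_episode_utility(episode_returns)

# A single update on a fresh table, with a vector reward.
Q_fresh = q_update(np.zeros((2, 2, 2)), 0, 0, [1.0, 2.0], 1, alpha=0.5)
```

With this toy utility, the averaged return `[2, 2]` scores well under SER even though each individual episode scores poorly, which is the kind of gap between run-averaged and per-episode outcomes that makes stochastic transitions matter for this criterion.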

Taxonomy

Topics: Innovation Diffusion and Forecasting · Energy Efficiency and Management · Reinforcement Learning in Robotics

Methods: Q-Learning