Navigating Trade-offs: Policy Summarization for Multi-Objective Reinforcement Learning

Zuzanna Osika; Jazmin Zatarain-Salazar; Frans A. Oliehoek; Pradeep K. Murukannaiah

arXiv:2411.04784·cs.AI·November 8, 2024

TL;DR

This paper introduces a clustering method for multi-objective reinforcement learning solutions, helping decision makers understand trade-offs and policy behaviors more effectively than traditional clustering methods.

Contribution

The paper presents a novel clustering approach that considers both policy behavior and objective values to analyze MORL solution sets, improving interpretability.

Findings

01 Outperforms traditional k-medoids clustering in four environments

02 Reveals relationships between policy behaviors and objective space regions

03 Demonstrates practical application through a real-world case study

Abstract

Multi-objective reinforcement learning (MORL) is used to solve problems involving multiple objectives. An MORL agent must make decisions based on the diverse signals provided by distinct reward functions. Training an MORL agent yields a set of solutions (policies), each presenting distinct trade-offs among the objectives (expected returns). MORL enhances explainability by enabling fine-grained comparisons of policies in the solution set based on their trade-offs as opposed to having a single policy. However, the solution set is typically large and multi-dimensional, where each policy (e.g., a neural network) is represented by its objective values. We propose an approach for clustering the solution set generated by MORL. By considering both policy behavior and objective values, our clustering method can reveal the relationship between policy behaviors and regions in the objective…
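The idea of clustering on both policy behavior and objective values can be illustrated with a minimal sketch. This is not the authors' algorithm: it simply blends two normalized distance matrices with a hypothetical `weight` parameter and runs a basic alternating k-medoids on the result. It assumes each policy is described by a vector of objective values and by a fixed-length behavior feature vector (for example, state-visitation frequencies); the function names and the behavior representation are illustrative assumptions.

```python
import numpy as np


def pairwise_dist(X):
    """Euclidean distance matrix between rows of X, scaled to [0, 1]."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    m = D.max()
    return D / m if m > 0 else D


def k_medoids(D, k, n_iter=100, seed=0):
    """Basic alternating k-medoids on a precomputed distance matrix D."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        # Assign every point to its nearest medoid.
        labels = D[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[costs.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = D[:, medoids].argmin(axis=1)
    return medoids, labels


def cluster_solution_set(objectives, behaviors, k, weight=0.5, seed=0):
    """Cluster policies on a blend of objective and behavior distances.

    `weight` is a hypothetical knob (an assumption of this sketch):
    1.0 uses objective values only, 0.0 uses behavior features only.
    """
    D = weight * pairwise_dist(objectives) + (1 - weight) * pairwise_dist(behaviors)
    return k_medoids(D, k, seed=seed)


# Synthetic stand-in for a solution set: 30 policies, 2 objectives,
# 5 behavior features per policy.
rng = np.random.default_rng(42)
objectives = rng.random((30, 2))
behaviors = rng.random((30, 5))
medoids, labels = cluster_solution_set(objectives, behaviors, k=3)
```

Setting `weight=1.0` reduces the sketch to clustering on objective values alone, which makes it easy to compare how cluster membership changes once behavior features are taken into account.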

Code & Models

Repositories

osikazuzanna/bi-objective-clustering (Official)

Taxonomy

Methods: Sparse Evolutionary Training