Bayesian Experience Reuse for Learning from Multiple Demonstrators

Michael Gimelfarb; Scott Sanner; Chi-Guhn Lee

arXiv:2006.05725·cs.LG·June 11, 2020

TL;DR

This paper introduces Bayesian Experience Reuse (BERS), a method that models uncertainty over demonstrations from multiple experts so they can be reused safely and efficiently in new tasks. It is demonstrated on static optimization and supply chain problems.

Contribution

It develops a Bayesian framework using normal-inverse-gamma priors and neural networks to model expert uncertainty, enabling safe and effective demonstration reuse in complex tasks.
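To make the normal-inverse-gamma ingredient concrete, the following is a minimal sketch of the textbook conjugate update for scalar Gaussian observations with unknown mean and variance. It is not the paper's model, which learns its posteriors with neural networks over shared features; the class `NormalInverseGamma` and the function `nig_posterior` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class NormalInverseGamma:
    """Parameters of a normal-inverse-gamma belief over (mean, variance)."""
    mu: float     # location of the unknown mean
    kappa: float  # pseudo-count: how strongly the prior pins the mean
    alpha: float  # shape of the inverse-gamma on the variance
    beta: float   # scale of the inverse-gamma on the variance


def nig_posterior(prior: NormalInverseGamma,
                  data: Sequence[float]) -> NormalInverseGamma:
    """Standard conjugate update for i.i.d. Gaussian observations.

    Illustrative assumption: scalar observations, no features.
    """
    n = len(data)
    if n == 0:
        return prior  # no evidence, belief is unchanged
    xbar = sum(data) / n
    sum_sq = sum((x - xbar) ** 2 for x in data)
    kappa_n = prior.kappa + n
    mu_n = (prior.kappa * prior.mu + n * xbar) / kappa_n
    alpha_n = prior.alpha + n / 2
    beta_n = (prior.beta
              + 0.5 * sum_sq
              + prior.kappa * n * (xbar - prior.mu) ** 2 / (2 * kappa_n))
    return NormalInverseGamma(mu_n, kappa_n, alpha_n, beta_n)


prior = NormalInverseGamma(mu=0.0, kappa=1.0, alpha=1.0, beta=1.0)
posterior = nig_posterior(prior, [1.0, 2.0, 3.0])
print(posterior)
```

The posterior mean is pulled from the prior location toward the sample mean in proportion to the number of observations, while `alpha` and `beta` track how much evidence there is about the noise level.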

Findings

01 BERS improves learning efficiency in static optimization.

02 BERS transfers demonstrations effectively in high-dimensional supply chain problems.

03 Modeling expert uncertainty improves how demonstrations are integrated.

Abstract

Learning from demonstrations (LfD) improves the exploration efficiency of a learning agent by incorporating demonstrations from experts. However, demonstration data can often come from multiple experts with conflicting goals, making it difficult to incorporate safely and effectively in online settings. We address this problem in the static and dynamic optimization settings by modelling the uncertainty in source and target task functions using normal-inverse-gamma priors, whose corresponding posteriors are, respectively, learned from demonstrations and target data using Bayesian neural networks with shared features. We use this learned belief to derive a quadratic programming problem whose solution yields a probability distribution over the expert models. Finally, we propose Bayesian Experience Reuse (BERS) to sample demonstrations in accordance with this distribution and reuse them…
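The sampling step described in the abstract can be sketched as follows, assuming the probability distribution over expert models has already been obtained (in the paper, from the quadratic program; here it is simply passed in as `weights`). The function `sample_demonstrations` and the buffer layout are hypothetical and chosen only for illustration; how the sampled demonstrations are then reused by the learner is not shown.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def sample_demonstrations(expert_buffers: Sequence[Sequence[T]],
                          weights: Sequence[float],
                          k: int,
                          rng: random.Random) -> List[Tuple[int, T]]:
    """Draw k demonstrations, picking the source expert by weight.

    Illustrative assumptions: one buffer of demonstrations per expert,
    and uniform sampling within the chosen expert's buffer.
    """
    if len(expert_buffers) != len(weights):
        raise ValueError("need exactly one weight per expert")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    # First choose which expert each sample comes from.
    expert_ids = rng.choices(range(len(expert_buffers)), weights=weights, k=k)
    # Then choose one stored demonstration from that expert.
    return [(i, rng.choice(expert_buffers[i])) for i in expert_ids]


buffers = [["a1", "a2"], ["b1"], ["c1", "c2", "c3"]]
batch = sample_demonstrations(buffers, [0.7, 0.0, 0.3], k=4,
                              rng=random.Random(0))
print(batch)
```

An expert with zero weight is never sampled, so demonstrations judged irrelevant to the target task are left out of the batch entirely.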
