MAVRL: Learning Reward Functions from Multiple Feedback Types with Amortized Variational Inference
Raphaël Baur, Yannick Metz, Maria Gkoulta, Mennatallah El-Assady, Giorgia Ramponi, Thomas Kleine Buening

TL;DR
This paper introduces MAVRL, a Bayesian framework that jointly learns reward functions from multiple heterogeneous feedback types using amortized variational inference, improving robustness and interpretability in reinforcement learning.
Contribution
The paper proposes a scalable amortized variational inference method that models each feedback type through its own explicit likelihood, so diverse signals can be combined for reward learning without manual loss balancing.
Findings
Jointly inferred reward posteriors outperform single-type baselines
Exploiting multiple feedback types improves policy robustness
Reward uncertainty offers interpretable confidence signals
Abstract
Reward learning typically relies on a single feedback type or combines multiple feedback types using manually weighted loss terms. Currently, it remains unclear how to jointly learn reward functions from heterogeneous feedback types such as demonstrations, comparisons, ratings, and stops that provide qualitatively different signals. We address this challenge by formulating reward learning from multiple feedback types as Bayesian inference over a shared latent reward function, where each feedback type contributes information through an explicit likelihood. We introduce a scalable amortized variational inference approach that learns a shared reward encoder and feedback-specific likelihood decoders and is trained by optimizing a single evidence lower bound. Our approach avoids reducing feedback to a common intermediate representation and eliminates the need for manual loss balancing.…
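As a rough illustration of what "a single evidence lower bound" over several feedback likelihoods can look like, the sketch below estimates an ELBO for a toy linear reward with two feedback types. Everything in it is an assumption made for illustration and is not taken from the paper: the linear reward, the Bradley-Terry likelihood for comparisons, the Gaussian likelihood for ratings, the standard normal prior, and all function names. It is also not amortized: the variational parameters are set by hand instead of being produced by a learned encoder.

```python
import math
import random


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))


def reward(w, features):
    # Hypothetical linear reward: r(x) = w . x
    return sum(wi * fi for wi, fi in zip(w, features))


def comparison_loglik(w, comparisons):
    # Assumed Bradley-Terry model: P(a preferred over b) = sigmoid(r(a) - r(b)).
    return sum(log_sigmoid(reward(w, a) - reward(w, b)) for a, b in comparisons)


def rating_loglik(w, ratings, noise_std=1.0):
    # Assumed Gaussian model: rating ~ Normal(r(x), noise_std^2).
    total = 0.0
    for x, y in ratings:
        err = y - reward(w, x)
        total += (-0.5 * (err / noise_std) ** 2
                  - math.log(noise_std) - 0.5 * math.log(2 * math.pi))
    return total


def gaussian_kl_to_standard_normal(mu, sigma):
    # KL( Normal(mu, diag(sigma^2)) || Normal(0, I) ), in closed form.
    return sum(0.5 * (s * s + m * m - 1.0) - math.log(s) for m, s in zip(mu, sigma))


def elbo_estimate(mu, sigma, comparisons, ratings, n_samples=200, seed=0):
    # Monte Carlo ELBO: E_q[sum of per-type log-likelihoods] - KL(q || prior).
    # The log-likelihoods are simply added; no hand-tuned weights are involved.
    rng = random.Random(seed)
    expected_loglik = 0.0
    for _ in range(n_samples):
        # Reparameterized sample of the reward weights from q.
        w = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        expected_loglik += comparison_loglik(w, comparisons) + rating_loglik(w, ratings)
    expected_loglik /= n_samples
    return expected_loglik - gaussian_kl_to_standard_normal(mu, sigma)


# Toy feedback: the first feature is "good", the second is "bad".
comparisons = [((1.0, 0.0), (0.0, 1.0))]              # first item preferred
ratings = [((1.0, 0.0), 1.0), ((0.0, 1.0), -1.0)]     # (features, rating)

elbo_good = elbo_estimate([1.0, -1.0], [0.3, 0.3], comparisons, ratings)
elbo_bad = elbo_estimate([-1.0, 1.0], [0.3, 0.3], comparisons, ratings)
print(elbo_good > elbo_bad)
```

In this toy setup, a posterior centered on weights that agree with both feedback types scores a higher ELBO than one centered on the opposite weights. The two candidates have the same KL term, so the difference comes entirely from the summed likelihoods.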
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning
