# Adaptive querying for reward learning from human feedback

**Authors:** Yashwanthi Anand, Nnamdi Nwagwu, Kevin Sabbe, Naomi T. Fitter, Sandhya Saisubramanian

PMC · DOI: 10.3389/frobt.2025.1734564 · Frontiers in Robotics and AI · 2026-02-12

## TL;DR

This paper introduces a method for robots to learn from human feedback by adaptively choosing which states to query and which feedback format to request, improving sample efficiency when learning to avoid unsafe behaviors.

## Contribution

The novelty is an adaptive feedback selection method that optimizes both query states and feedback formats to accelerate learning.

## Key findings

- The approach improves sample efficiency in learning to avoid unsafe behaviors in simulations.
- User studies with a physical robot show the method effectively gathers informative feedback aligned with user preferences.
- Feedback format selection accounts for the cost of each format and the probability of receiving feedback in that format, which supports practical use.

## Abstract

Learning from human feedback is a popular approach to train robots to adapt to user preferences and improve safety. Existing approaches typically consider a single querying (interaction) format when seeking human feedback and do not leverage multiple modes of user interaction with a robot. We examine how to learn a penalty function associated with unsafe behaviors using multiple forms of human feedback, by optimizing both the query state and feedback format. Our proposed adaptive feedback selection is an iterative, two-phase approach which first selects critical states for querying, and then uses information gain to select a feedback format for querying across the sampled critical states. The feedback format selection also accounts for the cost and probability of receiving feedback in a certain format. Our experiments in simulation demonstrate the sample efficiency of our approach in learning to avoid undesirable behaviors. The results of our user study with a physical robot highlight the practicality and effectiveness of adaptive feedback selection in seeking informative, user-aligned feedback that accelerates learning. Experiment videos, code and supplementary materials are available on our website: https://tinyurl.com/AFS-learning.
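
The two-phase loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the critical-state criterion or the exact scoring rule, so everything below is an assumption made for illustration. The sketch assumes a per-state belief that the state is unsafe, ranks states by binary entropy as a stand-in for "critical", and scores each format as `prob * expected_gain - cost`. The names `FeedbackFormat`, `informativeness`, `select_critical_states`, and `select_format`, as well as all numeric values, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackFormat:
    """Hypothetical description of one feedback format."""
    name: str
    informativeness: float  # assumed fraction of uncertainty one answer resolves
    cost: float             # assumed human effort for answering in this format
    prob: float             # assumed probability the human responds in this format


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli belief p that a state is unsafe."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_critical_states(beliefs: dict[str, float], k: int) -> list[str]:
    """Phase 1 (assumed criterion): the k states with the most uncertain belief."""
    return sorted(beliefs, key=lambda s: binary_entropy(beliefs[s]), reverse=True)[:k]


def select_format(formats: list[FeedbackFormat],
                  beliefs: dict[str, float],
                  states: list[str]) -> FeedbackFormat:
    """Phase 2 (assumed scoring): probability-weighted expected gain minus cost."""
    total_uncertainty = sum(binary_entropy(beliefs[s]) for s in states)

    def score(f: FeedbackFormat) -> float:
        expected_gain = f.informativeness * total_uncertainty
        return f.prob * expected_gain - f.cost

    return max(formats, key=score)


# Illustrative values only; none of these come from the paper.
beliefs = {"s0": 0.5, "s1": 0.9, "s2": 0.45, "s3": 0.02}
formats = [
    FeedbackFormat("approval", informativeness=0.5, cost=0.1, prob=0.9),
    FeedbackFormat("correction", informativeness=0.9, cost=0.6, prob=0.6),
    FeedbackFormat("demonstration", informativeness=1.0, cost=1.5, prob=0.4),
]

critical = select_critical_states(beliefs, k=2)
chosen = select_format(formats, beliefs, critical)
print(critical, chosen.name)
```

With these made-up numbers, the cheap and likely-to-be-answered format wins even though richer formats would resolve more uncertainty per answer. That trade-off between informativeness, cost, and response probability is what the abstract attributes to the format selection step. In an iterative setting, the beliefs would be updated after each round of feedback and both phases repeated.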

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12935605/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12935605/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/PMC12935605/full.md

---
Source: https://tomesphere.com/paper/PMC12935605