# Interactive Learning of Environment Dynamics for Sequential Tasks

**Authors:** Robert Loftin, Bei Peng, Matthew E. Taylor, Michael L. Littman, and David L. Roberts

arXiv: 1907.08478 · 2019-07-22

## TL;DR

This paper introduces Behavior Aware Modeling (BAM), an algorithm that incorporates a human teacher's knowledge into the agent's model of its environment's transition dynamics. Learning from a combination of task demonstrations and evaluative feedback, BAM can outperform approaches that ignore this source of dynamics knowledge.

## Contribution

The paper presents BAM, an algorithm that incorporates a teacher's knowledge into a model of the environment's transition dynamics, rather than leaving the agent to learn about the environment on its own.

## Key findings

- BAM can outperform approaches that do not explicitly treat the teacher as a source of dynamics knowledge.
- BAM was evaluated both in simulation and with real human teachers.
- In these evaluations, BAM learned from a combination of task demonstrations and evaluative feedback.

## Abstract

In order for robots and other artificial agents to efficiently learn to perform useful tasks defined by an end user, they must understand not only the goals of those tasks, but also the structure and dynamics of that user's environment. While existing work has looked at how the goals of a task can be inferred from a human teacher, the agent is often left to learn about the environment on its own. To address this limitation, we develop an algorithm, Behavior Aware Modeling (BAM), which incorporates a teacher's knowledge into a model of the transition dynamics of an agent's environment. We evaluate BAM both in simulation and with real human teachers, learning from a combination of task demonstrations and evaluative feedback, and show that it can outperform approaches which do not explicitly consider this source of dynamics knowledge.
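
The idea that a teacher's behavior carries information about the environment's dynamics can be illustrated with a toy sketch. This is not the paper's BAM implementation. It assumes a four-state chain world, two hypothetical candidate dynamics models (`normal` and `swapped` action effects), and a Boltzmann-rational teacher who knows the true dynamics and demonstrates the task. All names and parameters (`BETA`, `make_dynamics`, `posterior`, and so on) are illustrative assumptions, and only demonstrations are modeled, not evaluative feedback.

```python
import math

N_STATES = 4   # chain of states 0..3 (assumed toy world)
GOAL = 3       # terminal goal state
GAMMA = 0.9    # discount factor
BETA = 10.0    # assumed teacher rationality (higher = closer to optimal)


def make_dynamics(swapped):
    """Return a deterministic step(s, a) function for the chain.

    Action 1 moves right and action 0 moves left, unless `swapped`
    is True, in which case the two effects are exchanged.
    """
    def step(s, a):
        direction = 1 if a == 1 else -1
        if swapped:
            direction = -direction
        return min(max(s + direction, 0), N_STATES - 1)
    return step


def q_values(step, iters=100):
    """Value iteration; reward 1 for entering GOAL from another state."""
    v = [0.0] * N_STATES
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(iters):
        for s in range(N_STATES):
            for a in (0, 1):
                s2 = step(s, a)
                reward = 1.0 if (s != GOAL and s2 == GOAL) else 0.0
                q[s][a] = reward + GAMMA * v[s2]
        v = [max(row) for row in q]
        v[GOAL] = 0.0  # goal is terminal, so it has no future value
    return q


def posterior(demos, hypotheses, prior):
    """Bayesian posterior over dynamics hypotheses given (state, action) demos.

    The teacher is assumed to pick actions with probability proportional
    to exp(BETA * Q(s, a)) under the true dynamics.
    """
    log_post = []
    for name, step in hypotheses.items():
        q = q_values(step)
        ll = math.log(prior[name])
        for s, a in demos:
            log_z = math.log(sum(math.exp(BETA * q[s][b]) for b in (0, 1)))
            ll += BETA * q[s][a] - log_z
        log_post.append(ll)
    m = max(log_post)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in log_post]
    total = sum(weights)
    return {name: w / total for name, w in zip(hypotheses, weights)}


hypotheses = {"normal": make_dynamics(False), "swapped": make_dynamics(True)}
prior = {"normal": 0.5, "swapped": 0.5}
demos = [(0, 1), (1, 1), (2, 1)]  # teacher chooses action 1 in every state
post = posterior(demos, hypotheses, prior)
print(post)
```

Because the demonstrated actions only make sense for reaching the goal if action 1 moves right, the posterior shifts toward the `normal` hypothesis without the agent taking a single exploratory step itself.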

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.08478/full.md

## Figures

26 figures with captions in the complete paper: https://tomesphere.com/paper/1907.08478/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/1907.08478/full.md

---
Source: https://tomesphere.com/paper/1907.08478