# Examining the Use of Temporal-Difference Incremental Delta-Bar-Delta for Real-World Predictive Knowledge Architectures

**Authors:** Johannes Günther, Nadia M. Ady, Alex Kearney, Michael R. Dawson, Patrick M. Pilarski

arXiv: 1908.05751 · 2020-03-05

## TL;DR

This paper investigates the use of TIDBD, an online step-size adaptation method, in robotic prediction tasks, demonstrating its effectiveness and robustness in sensor-rich environments and its potential for autonomous learning.

## Contribution

It examines TIDBD as a practical online step-size adaptation method for predictive learning in robotics, one that lets step-size tuning and representation learning occur at the same time.

## Key findings

- TIDBD performs comparably to classic TD in prediction accuracy.
- Step-size adaptation with TIDBD appears to let a system automatically detect and characterize common sensor failures.
- Step-size adaptation enhances robustness in robotic learning environments.

## Abstract

Predictions and predictive knowledge have seen recent success in improving not only robot control but also other applications ranging from industrial process control to rehabilitation. A property that makes these predictive approaches well suited for robotics is that they can be learned online and incrementally through interaction with the environment. However, a remaining challenge for many prediction-learning approaches is an appropriate choice of prediction-learning parameters, especially parameters that control the magnitude of a learning machine's updates to its predictions (the learning rate or step size). To begin to address this challenge, we examine the use of online step-size adaptation using a sensor-rich robotic arm. Our method of choice, Temporal-Difference Incremental Delta-Bar-Delta (TIDBD), learns and adapts step sizes on a feature level; importantly, TIDBD allows step-size tuning and representation learning to occur at the same time. We show that TIDBD is a practical alternative for classic Temporal-Difference (TD) learning via an extensive parameter search. Both approaches perform comparably in terms of predicting future aspects of a robotic data stream. Furthermore, the use of a step-size adaptation method like TIDBD appears to allow a system to automatically detect and characterize common sensor failures in a robotic application. Together, these results promise to improve the ability of robotic devices to learn from interactions with their environments in a robust way, providing key capabilities for autonomous agents and robots.
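To make the idea of feature-level step-size adaptation concrete, the following is a minimal sketch of a TD(λ) update in which each feature carries its own step size, adapted online in the style of Incremental Delta-Bar-Delta. It is an illustration under stated assumptions, not the authors' implementation: the function names (`tidbd_step`, `run_demo`), the meta step size `theta`, the update ordering, and the toy one-feature problem are all hypothetical choices made for this example, and the paper's exact algorithm may differ in detail.

```python
import math


def tidbd_step(w, beta, h, z, phi, phi_next, reward,
               gamma=0.9, lam=0.9, theta=0.01):
    """One TD(lambda) update with per-feature step sizes alpha_i = exp(beta_i).

    Assumed (IDBD-style) scheme: beta_i is nudged by the meta step size
    theta according to how the TD error correlates with h_i, a decaying
    trace of recent weight updates for feature i.
    All lists are modified in place; the TD error is returned.
    """
    n = len(w)
    v = sum(w[i] * phi[i] for i in range(n))
    v_next = sum(w[i] * phi_next[i] for i in range(n))
    delta = reward + gamma * v_next - v  # TD error

    for i in range(n):
        # Meta update of the log step size for feature i.
        beta[i] += theta * delta * phi[i] * h[i]
        alpha = math.exp(beta[i])
        # Accumulating eligibility trace.
        z[i] = gamma * lam * z[i] + phi[i]
        # Weight update with the feature's own step size.
        w[i] += alpha * delta * z[i]
        # Trace of recent updates; the decay factor is clipped at zero.
        decay = max(0.0, 1.0 - alpha * phi[i] * z[i])
        h[i] = h[i] * decay + alpha * delta * z[i]
    return delta


def run_demo(steps=2000):
    """Toy problem (assumed): one always-active feature, constant signal 1.

    With gamma = 0.9 the true discounted sum is 1 / (1 - 0.9) = 10.
    """
    w, h, z = [0.0], [0.0], [0.0]
    beta = [math.log(0.1)]  # initial step size of 0.1
    for _ in range(steps):
        tidbd_step(w, beta, h, z, phi=[1.0], phi_next=[1.0], reward=1.0)
    return w, beta
```

Calling `run_demo()` returns the learned weight and the log step size. In a multi-feature setting, inspecting `exp(beta[i])` per feature is one way to see which inputs the learner has come to treat as informative.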

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.05751/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1908.05751/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1908.05751/full.md

---
Source: https://tomesphere.com/paper/1908.05751