# Reward Potentials for Planning with Learned Neural Network Transition Models

**Authors:** Buser Say, Scott Sanner, Sylvie Thiébaux

arXiv: 1904.09366 · 2019-07-29

## TL;DR

This paper formulates the computation of reward potentials for learned neural network (NN) transition models as a bilevel program, and uses the resulting reward bounds to tighten the MILP relaxation, making optimal planning in continuous state and action spaces more efficient.

## Contribution

The paper presents a novel finite-time constraint generation algorithm for finding optimal reward potentials, then uses the precomputed potentials to add reward-bounding constraints that strengthen the linear relaxation of the MILP model for NN-based planning.
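The general shape of a constraint generation loop can be illustrated on a toy problem. The sketch below is not the paper's algorithm: the reward function, the sign-pattern feature, the enumerated state set, and all function names are hypothetical, and the separation step is done by brute force where a real implementation would presumably solve an optimization problem. It only shows the pattern of solving a master problem over a subset of constraints, searching for a violated constraint, and repeating until none remains.

```python
from itertools import product

def reward(state):
    # Hypothetical reward over a 2-D integer state (stand-in for a learned model).
    x, y = state
    return -(x - 1) ** 2 - (y + 2) ** 2 + 3 * x

def feature(state):
    # Hypothetical discrete feature: the sign pattern of the state.
    # Each feature value receives its own potential (an upper bound on reward).
    x, y = state
    return (x >= 0, y >= 0)

# Small enumerable state set, assumed only so the oracle can use brute force.
STATES = list(product(range(-3, 4), repeat=2))

def most_violated(potentials):
    # Separation oracle: return a state whose reward exceeds its potential by
    # the largest margin (ties broken by enumeration order), or None.
    best_state, best_gap = None, 0.0
    for s in STATES:
        gap = reward(s) - potentials.get(feature(s), float("-inf"))
        if gap > best_gap:
            best_state, best_gap = s, gap
    return best_state

def compute_potentials():
    potentials = {}  # master solution: one bound per feature value
    iterations = 0
    while True:
        s = most_violated(potentials)
        if s is None:
            return potentials, iterations  # no violated constraint remains
        # Master step: smallest potential satisfying every constraint
        # generated so far for this feature value.
        f = feature(s)
        potentials[f] = max(potentials.get(f, float("-inf")), reward(s))
        iterations += 1

potentials, iterations = compute_potentials()
```

Each iteration strictly raises one potential to the reward of some state, and the state set is finite, so this toy loop terminates after finitely many steps with the tightest valid bound per feature value.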

## Key findings

- Efficient computation of reward potentials for learned NN models.
- Strengthening of MILP models improves planning over long horizons.
- Overhead of potential computation is justified by better planning performance.

## Abstract

Optimal planning with respect to learned neural network (NN) models in continuous action and state spaces using mixed-integer linear programming (MILP) is a challenging task for branch-and-bound solvers due to the poor linear relaxation of the underlying MILP model. For a given set of features, potential heuristics provide an efficient framework for computing bounds on cost (reward) functions. In this paper, we model the problem of finding optimal potential bounds for learned NN models as a bilevel program, and solve it using a novel finite-time constraint generation algorithm. We then strengthen the linear relaxation of the underlying MILP model by introducing constraints to bound the reward function based on the precomputed reward potentials. Experimentally, we show that our algorithm efficiently computes reward potentials for learned NN models, and that the overhead of computing reward potentials is justified by the overall strengthening of the underlying MILP model for the task of planning over long horizons.
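To make the "poor linear relaxation" concrete, here is a minimal sketch for a single ReLU unit. It assumes the standard big-M encoding of `y = max(0, x)` with a binary variable `z` and pre-activation bounds `lo < 0 < hi`; this summary does not state the paper's exact encoding, so the constraints, bounds, and function names below are assumptions made for illustration.

```python
def relu(x):
    # Exact ReLU output.
    return max(0.0, x)

def relaxed_upper_bound(x, lo, hi, steps=1000):
    # Largest y permitted by the big-M upper constraints
    #   y <= x - lo * (1 - z)   and   y <= hi * z
    # once the binary z is relaxed to 0 <= z <= 1.
    # (The encoding also has y >= x and y >= 0, which do not limit y from above.)
    # A grid search over z stands in for an LP solve.
    best = float("-inf")
    for i in range(steps + 1):
        z = i / steps
        best = max(best, min(x - lo * (1 - z), hi * z))
    return best

lo, hi = -4.0, 4.0  # assumed pre-activation bounds (hypothetical values)
x = 0.0

exact = relu(x)
relaxed = relaxed_upper_bound(x, lo, hi)
closed_form = hi * (x - lo) / (hi - lo)  # where the two upper constraints meet
```

With these values the relaxation admits `y` up to 2.0 although the true ReLU output is 0.0. Gaps of this kind can compound across units and time steps, which is why additional valid bounds on the reward can help a branch-and-bound solver.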

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.09366/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1904.09366/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/1904.09366/full.md

---
Source: https://tomesphere.com/paper/1904.09366