# Reinforcement Learning with Non-uniform State Representations for Adaptive Search

**Authors:** Sandeep Manjanna, Herke van Hoof, Gregory Dudek

arXiv: 1906.06588 · 2019-06-18

## TL;DR

This paper introduces a reinforcement learning-based search algorithm that optimizes spatial exploration by non-uniform sampling and state aggregation, enabling rapid target detection in search and rescue missions.

## Contribution

The paper presents a non-uniform state aggregation method for policy search in reinforcement learning, applied to robotic search tasks.
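To make the idea of non-uniform state aggregation concrete, here is a minimal sketch. It assumes a robot-centric layout in which cells close to the robot keep fine resolution while distant cells are summed into progressively wider bands. The band widths, the quadrant split, and the names `band_index` and `aggregate_state` are hypothetical choices for illustration; this summary does not specify the paper's actual aggregation scheme.

```python
def band_index(distance):
    """Map a Chebyshev distance to a band whose width doubles: 0, 1, 2-3, 4-7, ..."""
    return 0 if distance == 0 else distance.bit_length()


def aggregate_state(grid, robot_row, robot_col):
    """Sum per-cell probability mass into (band, quadrant) features.

    grid: list of rows of per-cell probability mass.
    The band and quadrant layout is an assumption made for this sketch.
    """
    features = {}
    for r, row in enumerate(grid):
        for c, mass in enumerate(row):
            dr, dc = r - robot_row, c - robot_col
            band = band_index(max(abs(dr), abs(dc)))
            # The robot's own cell forms a single feature; other cells are
            # split by the sign of their row and column offsets.
            quadrant = (dr >= 0, dc >= 0) if band else (True, True)
            key = (band, quadrant)
            features[key] = features.get(key, 0.0) + mass
    return features


# Usage: an 8x8 uniform map with the robot in the top-left corner.
uniform = [[1.0 / 64] * 8 for _ in range(8)]
state = aggregate_state(uniform, 0, 0)
```

With this layout the 64 grid cells collapse into a handful of features, and the feature count grows roughly logarithmically with map size, which is the kind of reduction that makes policy search over large maps more tractable.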

## Key findings

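- The generated trajectories optimize the rate at which probability mass is covered, which minimizes the expected time to locate a lost target.
- Parameters learned on one set of probability distributions can be reused to generate action plans for new, similar-looking distributions.
- Non-uniform sampling of the search region steers the searcher toward high-probability regions.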
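One way to score a trajectory by how quickly it covers probability mass is a discounted sum of the mass collected at each step. The sketch below is an illustration under stated assumptions, not the paper's reward definition: the function name `discounted_coverage`, the discount factor `gamma`, and the rule that a cell's mass is collected only on the first visit are all hypothetical.

```python
def discounted_coverage(grid, path, gamma=0.95):
    """Sum the probability mass collected along `path`, discounted per step.

    grid: list of rows of per-cell probability mass.
    path: sequence of (row, col) cells in visiting order.
    gamma: assumed discount factor; values below 1 reward early coverage.
    """
    visited = set()
    total = 0.0
    for step, (row, col) in enumerate(path):
        if (row, col) not in visited:  # mass in a cell is collected only once
            total += (gamma ** step) * grid[row][col]
            visited.add((row, col))
    return total


# Usage: the same four cells visited in two different orders.
grid = [[0.1, 0.2],
        [0.3, 0.4]]
high_first = [(1, 1), (1, 0), (0, 0), (0, 1)]
low_first = list(reversed(high_first))
score_high_first = discounted_coverage(grid, high_first, gamma=0.9)
score_low_first = discounted_coverage(grid, low_first, gamma=0.9)
```

Both paths cover the full map, but the path that reaches the high-probability cells first receives the higher score, which matches the intuition behind minimizing expected time to detection.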

## Abstract

Efficient spatial exploration is a key aspect of search and rescue. In this paper, we present a search algorithm that generates efficient trajectories that optimize the rate at which probability mass is covered by a searcher. This should allow an autonomous vehicle to find one or more lost targets as rapidly as possible. We do this by performing non-uniform sampling of the search region. The path generated minimizes the expected time to locate the missing target by visiting high-probability regions using non-myopic path generation based on reinforcement learning. We model the target probability distribution using a classic mixture of Gaussians model with means and mixture coefficients tuned according to the location and time of sightings of the lost target. Key features of our search algorithm are the ability to employ a very general non-deterministic action model and the ability to generate action plans for any new probability distribution using the parameters learned on other similar-looking distributions. One of the key contributions of this paper is the use of non-uniform state aggregation for policy search in the context of robotics.
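The mixture-of-Gaussians target model mentioned in the abstract can be sketched as a probability map over a discretized search region. This is a minimal illustration that assumes isotropic components, hand-picked weights, and a small grid. The names `gaussian_2d` and `mixture_grid` are hypothetical, and the sketch does not model how the paper tunes means and mixture coefficients from the location and time of sightings.

```python
import math


def gaussian_2d(x, y, mean, sigma):
    """Density of an isotropic 2D Gaussian at (x, y)."""
    dx, dy = x - mean[0], y - mean[1]
    norm = 2.0 * math.pi * sigma * sigma
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma)) / norm


def mixture_grid(width, height, components):
    """Evaluate a Gaussian mixture at cell centers and normalize to sum to 1.

    components: list of (weight, (mean_x, mean_y), sigma) tuples; in this
    sketch each tuple stands in for one sighting of the target.
    """
    grid = [
        [sum(w * gaussian_2d(x + 0.5, y + 0.5, m, s) for w, m, s in components)
         for x in range(width)]
        for y in range(height)
    ]
    total = sum(sum(row) for row in grid)
    return [[value / total for value in row] for row in grid]


# Usage: two assumed sightings, the first weighted more heavily.
components = [(0.7, (2.5, 2.5), 1.0), (0.3, (7.5, 6.5), 1.5)]
prob_map = mixture_grid(10, 10, components)
```

Normalizing over the grid turns the sampled densities into per-cell probability mass, which is the quantity a searcher would then try to cover as quickly as possible.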

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.06588/full.md

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/1906.06588/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1906.06588/full.md

---
Source: https://tomesphere.com/paper/1906.06588