# RSS-Based Q-Learning for Indoor UAV Navigation

**Authors:** Md Moin Uddin Chowdhury, Fatih Erden, and Ismail Guvenc

arXiv: 1905.13406 · 2019-06-03

## TL;DR

This paper introduces an RSS-based Q-learning approach for indoor UAV navigation in search and rescue missions, enabling UAVs to locate signal sources without GPS or prior environment knowledge.

## Contribution

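The paper proposes an RSS-based definition of the states and rewards used by Q-learning, replacing the GPS coordinates that location-based Q-learning relies on and removing the need for a priori information about the environment in indoor UAV navigation.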
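For context, the standard tabular Q-learning update is shown below. The symbols are the textbook ones, not notation taken from this paper: $s_t$ is the state, $a_t$ the action, $r_{t+1}$ the reward, $\alpha$ the learning rate, and $\gamma$ the discount factor. In an RSS-based formulation, $s_t$ and $r_{t+1}$ would be derived from the measured RSS rather than from the agent's coordinates. The exact definitions are in the full paper and are not reproduced here.

$$
Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
$$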

## Key findings

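- RSS-based Q-learning performs competitively with the location-based baseline in convergence speed, average number of steps per episode, and total length of the final trajectory.
- The evaluation uses two indoor scenarios of different dimensions, built with ray tracing software, with heat maps of the RSS at each possible UAV location.
- Because states and rewards are defined from RSS alone, the method suits GPS-denied indoor search and rescue (SAR) missions.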

## Abstract

In this paper, we focus on the potential use of unmanned aerial vehicles (UAVs) for search and rescue (SAR) missions in GPS-denied indoor environments. We consider the problem of navigating a UAV to a wireless signal source, e.g., a smartphone or watch owned by a victim. We assume that the source periodically transmits RF signals to nearby wireless access points. Received signal strength (RSS) at the UAV, which is a function of the UAV and source positions, is fed to a Q-learning algorithm and the UAV is navigated to the vicinity of the source. Unlike the traditional location-based Q-learning approach that uses the GPS coordinates of the agent, our method uses the RSS to define the states and rewards of the algorithm. It does not require any a priori information about the environment. These, in turn, make it possible to use the UAVs in indoor SAR operations. Two indoor scenarios with different dimensions are created using a ray tracing software. Then, the corresponding heat maps that show the RSS at each possible UAV location are extracted for more realistic analysis. Performance of the RSS-based Q-learning algorithm is compared with the baseline (location-based) Q-learning algorithm in terms of convergence speed, average number of steps per episode, and the total length of the final trajectory. Our results show that the RSS-based Q-learning provides competitive performance with the location-based Q-learning.
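The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. Everything specific in it is an assumption made for illustration: the grid size, the source position, the log-distance path loss model standing in for the ray-traced heat maps, the quantization of RSS into `N_BINS` states, the +1/-1 reward for rising or falling RSS, and the stopping rule. The helper names (`make_heatmap`, `rss_to_state`, `step`, `train`) are hypothetical.

```python
import numpy as np

# Hypothetical setup: none of these values come from the paper.
GRID = 10                                      # grid cells per side
SOURCE = (7, 7)                                # (row, col) of the signal source
N_BINS = 8                                     # number of quantized RSS states
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # (d_row, d_col) moves


def make_heatmap(grid=GRID, source=SOURCE, tx_dbm=0.0, exponent=2.0):
    """Synthetic RSS map (dB) from a log-distance path loss model.

    This is a stand-in for the ray-traced heat maps mentioned in the abstract.
    """
    rows, cols = np.mgrid[0:grid, 0:grid]
    dist = np.hypot(rows - source[0], cols - source[1])
    return tx_dbm - 10.0 * exponent * np.log10(np.maximum(dist, 1.0))


def rss_to_state(rss, lo, hi, n_bins=N_BINS):
    """Map an RSS value to a discrete state index in [0, n_bins - 1]."""
    frac = (rss - lo) / (hi - lo)
    return int(np.clip(frac * n_bins, 0, n_bins - 1))


def step(pos, action, heatmap):
    """Move one cell (clamped to the grid); reward +1 if RSS rose, else -1."""
    r = min(max(pos[0] + action[0], 0), heatmap.shape[0] - 1)
    c = min(max(pos[1] + action[1], 0), heatmap.shape[1] - 1)
    reward = 1.0 if heatmap[r, c] > heatmap[pos] else -1.0
    return (r, c), reward


def train(heatmap, episodes=200, max_steps=100,
          alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning whose states are RSS bins only."""
    rng = np.random.default_rng(seed)
    lo, hi = heatmap.min(), heatmap.max()
    q = np.zeros((N_BINS, len(ACTIONS)))
    for _ in range(episodes):
        # The position is used only by the simulator, never by the agent.
        pos = (int(rng.integers(heatmap.shape[0])),
               int(rng.integers(heatmap.shape[1])))
        for _ in range(max_steps):
            s = rss_to_state(heatmap[pos], lo, hi)
            if rng.random() < eps:
                a = int(rng.integers(len(ACTIONS)))   # explore
            else:
                a = int(np.argmax(q[s]))              # exploit
            nxt, reward = step(pos, ACTIONS[a], heatmap)
            s2 = rss_to_state(heatmap[nxt], lo, hi)
            # Standard Q-learning update on RSS-derived state and reward.
            q[s, a] += alpha * (reward + gamma * q[s2].max() - q[s, a])
            pos = nxt
            if s2 == N_BINS - 1:   # strongest bin: treat as "near the source"
                break
    return q
```

Calling `train(make_heatmap())` returns a Q-table of shape `(N_BINS, len(ACTIONS))`. The point of the sketch is that the agent's state and reward are computed from RSS alone; the grid position exists only inside the simulated environment. Because many cells share the same RSS bin, this simplified state is heavily aliased, so the sketch should not be expected to reproduce the performance reported in the paper.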

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.13406/full.md

## Figures

21 figures with captions in the complete paper: https://tomesphere.com/paper/1905.13406/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1905.13406/full.md

---
Source: https://tomesphere.com/paper/1905.13406