# Reinforcement Learning Scheduler for Vehicle-to-Vehicle Communications Outside Coverage

**Authors:** Taylan Şahin, Ramin Khalili, Mate Boban, Adam Wolisz

arXiv: 1904.12653 · 2019-04-30

## TL;DR

This paper explores using reinforcement learning to develop a centralized scheduler for vehicle-to-vehicle communications, aiming to improve resource allocation during coverage outages, with promising initial results.

## Contribution

It introduces an RL-based centralized scheduler for V2V communication that pre-assigns non-interfering resources to vehicles before they enter an out-of-coverage area. In initial results it performs as well as or better than the state-of-the-art distributed scheduler.

## Key findings

- In initial results, the RL scheduler performs as well as or better than the state-of-the-art distributed scheduler.
- Training completes within a few hundred to a few thousand epochs.
- The RL approach is a promising solution for V2V communications with intermittent network coverage.

## Abstract

Radio resources in vehicle-to-vehicle (V2V) communication can be scheduled either by a centralized scheduler residing in the network (e.g., a base station in the case of cellular systems) or by a distributed scheduler, where the resources are autonomously selected by the vehicles. The former approach yields considerably higher resource utilization when network coverage is uninterrupted. However, when coverage is intermittent or absent, vehicles receive no input from the centralized scheduler and need to revert to distributed scheduling. Motivated by recent advances in reinforcement learning (RL), we investigate whether a centralized learning scheduler can be taught to efficiently pre-assign resources to vehicles for out-of-coverage V2V communication. Specifically, we use the actor-critic RL algorithm to train the centralized scheduler to provide non-interfering resources to vehicles before they enter the out-of-coverage area. Our initial results show that an RL-based scheduler can achieve performance as good as or better than the state-of-the-art distributed scheduler, often outperforming it. Furthermore, the learning process completes within a reasonable time (ranging from a few hundred to a few thousand epochs), making the RL-based scheduler a promising solution for V2V communications with intermittent network coverage.
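To make the actor-critic idea in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's method: the abstract does not specify the state, action, or reward design, so everything below is an assumption for illustration. Vehicles are assumed to drive in a chain where only neighbours interfere, the state is the pair (vehicle index, resource given to the previous vehicle), the reward is -1 for each neighbouring pair that shares a resource, and both actor and critic are simple tables rather than neural networks. The names `train_scheduler`, `greedy_assignment`, and `count_collisions` are hypothetical.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def count_collisions(assignment):
    """Count neighbouring vehicles (assumed chain topology) sharing a resource."""
    return sum(1 for a, b in zip(assignment, assignment[1:]) if a == b)


def train_scheduler(n_vehicles=4, n_resources=2, episodes=3000,
                    actor_lr=0.2, critic_lr=0.2, gamma=1.0, seed=0):
    """One-step TD actor-critic with a tabular softmax actor and tabular critic."""
    rng = random.Random(seed)
    theta = {}   # actor: state -> list of action preferences
    value = {}   # critic: state -> estimated return

    for _ in range(episodes):
        prev = -1  # no resource assigned yet
        for i in range(n_vehicles):
            state = (i, prev)
            prefs = theta.setdefault(state, [0.0] * n_resources)
            probs = softmax(prefs)
            action = rng.choices(range(n_resources), weights=probs)[0]

            # Assumed reward: penalise reusing the neighbour's resource.
            reward = -1.0 if action == prev else 0.0

            next_state = (i + 1, action)
            terminal = (i + 1 == n_vehicles)
            v_next = 0.0 if terminal else value.get(next_state, 0.0)
            v_now = value.get(state, 0.0)

            # Critic: TD(0) update; the TD error also drives the actor.
            td_error = reward + gamma * v_next - v_now
            value[state] = v_now + critic_lr * td_error

            # Actor: policy-gradient step for a softmax policy.
            for r in range(n_resources):
                grad = (1.0 if r == action else 0.0) - probs[r]
                prefs[r] += actor_lr * td_error * grad

            prev = action
    return theta


def greedy_assignment(theta, n_vehicles, n_resources):
    """Pre-assign one resource per vehicle by following the learned policy greedily."""
    assignment, prev = [], -1
    for i in range(n_vehicles):
        prefs = theta.get((i, prev), [0.0] * n_resources)
        action = max(range(n_resources), key=lambda r: prefs[r])
        assignment.append(action)
        prev = action
    return assignment


if __name__ == "__main__":
    policy = train_scheduler()
    plan = greedy_assignment(policy, n_vehicles=4, n_resources=2)
    print("assignment:", plan, "collisions:", count_collisions(plan))
```

In this toy setting the learned assignment plays the role of the resources handed out before vehicles leave coverage. A realistic scheduler would need vehicle positions, mobility, and a radio interference model, none of which are modelled here.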

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.12653/full.md

## Figures

17 figures with captions in the complete paper: https://tomesphere.com/paper/1904.12653/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1904.12653/full.md

---
Source: https://tomesphere.com/paper/1904.12653