# A Reinforcement Learning Perspective on the Optimal Control of Mutation Probabilities for the (1+1) Evolutionary Algorithm: First Results on the OneMax Problem

**Authors:** Luca Mossina, Emmanuel Rachelson, Daniel Delahaye

arXiv: 1905.03726 · 2019-05-10

## TL;DR

This paper explores using Reinforcement Learning to dynamically control mutation probabilities in a (1+1) evolutionary algorithm on the OneMax problem, demonstrating how RL can optimize algorithm parameters without prior knowledge of transition probabilities.

## Contribution

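The paper casts mutation-probability control for the (1+1) evolutionary algorithm on OneMax as a Markov Decision Process. It solves this problem first with Value Iteration, which uses the known transition probabilities, and then with Q-Learning, which does not need them.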

## Key findings

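- Controlling the mutation probability of a (1+1) evolutionary algorithm on OneMax can be modeled as a Markov Decision Process.
- Value Iteration solves this process using the known transition probabilities; Q-Learning solves it without needing the exact transition probabilities.
- The approach allows prior expert or empirical knowledge to be included in learning.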
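
The Q-Learning finding above can be illustrated with a minimal tabular sketch. It is not the authors' implementation. The state (current OneMax fitness), the discrete set of candidate mutation rates, the reward of -1 per iteration, and the hyperparameters `alpha`, `epsilon`, and `episodes` are all assumptions chosen for illustration. The learner only samples mutations; it never looks at transition probabilities.

```python
import random


def ea_step(bits, p, rng):
    """One (1+1) EA iteration: flip each bit with probability p and
    keep the offspring only if its OneMax fitness is not worse."""
    child = [b ^ (rng.random() < p) for b in bits]
    return child if sum(child) >= sum(bits) else bits


def q_learning(n, rates, episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-Learning over states 0..n (fitness) and actions = indices
    into `rates` (hypothetical candidate mutation probabilities)."""
    rng = random.Random(seed)
    # Q[n] is the terminal state and is never updated, so it stays at zero.
    Q = [[0.0] * len(rates) for _ in range(n + 1)]
    for _ in range(episodes):
        bits = [rng.randint(0, 1) for _ in range(n)]
        s = sum(bits)
        while s < n:
            # Epsilon-greedy choice of the mutation rate.
            if rng.random() < epsilon:
                a = rng.randrange(len(rates))
            else:
                a = max(range(len(rates)), key=lambda k: Q[s][k])
            bits = ea_step(bits, rates[a], rng)
            s_next = sum(bits)
            # Assumed reward: -1 per iteration, undiscounted.
            target = -1.0 + max(Q[s_next])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s_next
    return Q


if __name__ == "__main__":
    rates = [0.125, 0.25, 0.5]
    Q = q_learning(8, rates)
    for s, row in enumerate(Q[:-1]):
        best = max(range(len(rates)), key=lambda k: row[k])
        print(f"fitness {s}: greedy rate {rates[best]}")
```

Under these assumptions, each Q-value estimates the negated expected number of remaining iterations, so the greedy rate per fitness level is the learned control policy. Initializing the table at zero is optimistic for negative rewards, which pushes the learner to try every rate at least once in the states it visits.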

## Abstract

We study how Reinforcement Learning can be employed to optimally control parameters in evolutionary algorithms. We control the mutation probability of a (1+1) evolutionary algorithm on the OneMax function. This problem is modeled as a Markov Decision Process and solved with Value Iteration via the known transition probabilities. It is then solved via Q-Learning, a Reinforcement Learning algorithm, where the exact transition probabilities are not needed. This approach also allows previous expert or empirical knowledge to be included into learning. It opens new perspectives, both formally and computationally, for the problem of parameter control in optimization.
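
As a rough illustration of the model-based step described in the abstract, the sketch below runs Value Iteration on a small OneMax instance. It rests on assumptions not stated in this summary: standard bit mutation (each bit flips independently with probability `p`), elitist acceptance, a finite set of candidate rates as the action space, and a reward of -1 per iteration so that the value of a state is the negated expected remaining runtime. The function names are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def transition_row(i, n, p):
    """Distribution of the next fitness given fitness i and mutation rate p,
    for an elitist (1+1) EA with standard bit mutation (assumed)."""
    row = [0.0] * (n + 1)
    for gained in range(n - i + 1):      # zeros flipped to ones
        for lost in range(i + 1):        # ones flipped to zeros
            prob = binom_pmf(gained, n - i, p) * binom_pmf(lost, i, p)
            j = i + gained - lost
            row[j if j >= i else i] += prob  # worse offspring are rejected
    return row


def backup(row, V):
    """One-step lookahead with the assumed reward of -1 per iteration."""
    return -1.0 + sum(q * V[j] for j, q in enumerate(row))


def value_iteration(n, rates, tol=1e-10, max_sweeps=100000):
    rows = {(i, p): transition_row(i, n, p) for i in range(n) for p in rates}
    V = [0.0] * (n + 1)                  # V[n] = 0: the optimum is terminal
    for _ in range(max_sweeps):
        delta = 0.0
        for i in range(n):
            best = max(backup(rows[i, p], V) for p in rates)
            delta = max(delta, abs(best - V[i]))
            V[i] = best                  # in-place (Gauss-Seidel) update
        if delta < tol:
            break
    policy = [max(rates, key=lambda p: backup(rows[i, p], V)) for i in range(n)]
    return V, policy


if __name__ == "__main__":
    n = 10
    rates = [k / n for k in range(1, 6)]
    V, policy = value_iteration(n, rates)
    for i in range(n):
        print(f"fitness {i}: rate {policy[i]:.1f}, expected steps {-V[i]:.2f}")
```

Because the problem is undiscounted, convergence here relies on every state having a positive probability of improving under each candidate rate, which holds for rates strictly between 0 and 1.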

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.03726/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1905.03726/full.md

## References

11 references — full list in the complete paper: https://tomesphere.com/paper/1905.03726/full.md

---
Source: https://tomesphere.com/paper/1905.03726