# Optimizing thermodynamic trajectories using evolutionary and gradient-based reinforcement learning

**Authors:** Chris Beeler, Uladzimir Yahorau, Rory Coles, Kyle Mills, Stephen Whitelam, and Isaac Tamblyn

arXiv: 1903.08543 · 2021-12-21

## TL;DR

This paper demonstrates that neural network-based reinforcement learning, both evolutionary and gradient-based, can discover and optimize thermodynamic cycles for maximum efficiency, including previously unknown cycles.

## Contribution

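The paper applies evolutionary and gradient-based reinforcement learning to the optimization of thermodynamic trajectories in a model heat engine. The trained neural networks recover known cycles (Carnot, Stirling, and Otto) and, when given an additional irreversible process, find a previously unknown one.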

## Key findings

- Evolutionary RL finds the optimal Carnot cycle.
- Gradient-based RL learns the Stirling cycle.
- When given an additional irreversible process, the evolutionary scheme learns a previously unknown thermodynamic cycle.
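
To make the gap between the Carnot and Stirling results concrete, the following is a minimal sketch based on textbook ideal-gas formulas. It is not the paper's engine model: the function names, the monatomic heat capacity `cv = 1.5 * R`, the regenerator-free Stirling cycle, and the example temperatures are all assumptions chosen for illustration.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def carnot_efficiency(t_hot, t_cold):
    """Carnot limit 1 - Tc/Th for reservoirs at t_hot and t_cold (kelvin)."""
    return 1.0 - t_cold / t_hot


def stirling_efficiency_no_regenerator(t_hot, t_cold, volume_ratio, cv=1.5 * R):
    """Efficiency of an ideal-gas Stirling cycle without a regenerator.

    Assumption: one mole of ideal gas; cv defaults to the monatomic value.
    Net work comes from the two isothermal strokes; heat is absorbed during
    the hot isothermal expansion and the constant-volume heating stroke.
    """
    log_ratio = math.log(volume_ratio)
    net_work = R * (t_hot - t_cold) * log_ratio
    heat_in = R * t_hot * log_ratio + cv * (t_hot - t_cold)
    return net_work / heat_in


# Hypothetical example values: reservoirs at 500 K and 300 K, volume ratio 2.
eta_carnot = carnot_efficiency(500.0, 300.0)
eta_stirling = stirling_efficiency_no_regenerator(500.0, 300.0, 2.0)
print(f"Carnot:   {eta_carnot:.3f}")
print(f"Stirling: {eta_stirling:.3f}")
```

Under these assumptions the Carnot limit is 0.4, while the regenerator-free Stirling cycle reaches only about 0.21, because the heat supplied during constant-volume heating is not recovered.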

## Abstract

Using a model heat engine, we show that neural network-based reinforcement learning can identify thermodynamic trajectories of maximal efficiency. We consider both gradient and gradient-free reinforcement learning. We use an evolutionary learning algorithm to evolve a population of neural networks, subject to a directive to maximize the efficiency of a trajectory composed of a set of elementary thermodynamic processes; the resulting networks learn to carry out the maximally-efficient Carnot, Stirling, or Otto cycles. When given an additional irreversible process, this evolutionary scheme learns a previously unknown thermodynamic cycle. Gradient-based reinforcement learning is able to learn the Stirling cycle, whereas an evolutionary approach achieves the optimal Carnot cycle. Our results show how the reinforcement learning strategies developed for game playing can be applied to solve physical problems conditioned upon path-extensive order parameters.
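
The gradient-free idea in the abstract can be illustrated with a toy sketch. The paper evolves a population of neural networks that select elementary thermodynamic processes; the code below swaps in a much simpler stand-in that evolves two numbers, the operating temperatures of a cycle, and scores each candidate by the Carnot-form efficiency. The function names, temperature bounds, population size, and mutation scale are all hypothetical.

```python
import random


def efficiency(t_hot, t_cold):
    """Toy fitness: Carnot-form efficiency of a cycle run between two temperatures."""
    return 1.0 - t_cold / t_hot


def evolve(t_min=300.0, t_max=500.0, population_size=20,
           generations=200, sigma=10.0, seed=0):
    """Mutation-only evolutionary search with elitist selection.

    Assumption: each individual is a (t_hot, t_cold) pair confined to
    [t_min, t_max], standing in for the neural networks used in the paper.
    """
    rng = random.Random(seed)

    def clip(temperature):
        # Keep candidates inside the allowed reservoir range.
        return min(max(temperature, t_min), t_max)

    population = [(rng.uniform(t_min, t_max), rng.uniform(t_min, t_max))
                  for _ in range(population_size)]
    for _ in range(generations):
        # Rank by fitness and keep the top quarter unchanged (elitism).
        population.sort(key=lambda ind: efficiency(*ind), reverse=True)
        parents = population[: population_size // 4]
        # Refill the population with Gaussian-mutated copies of the parents.
        children = []
        while len(parents) + len(children) < population_size:
            t_hot, t_cold = rng.choice(parents)
            children.append((clip(t_hot + rng.gauss(0.0, sigma)),
                             clip(t_cold + rng.gauss(0.0, sigma))))
        population = parents + children
    return max(population, key=lambda ind: efficiency(*ind))


best_t_hot, best_t_cold = evolve()
best_efficiency = efficiency(best_t_hot, best_t_cold)
print(best_t_hot, best_t_cold, best_efficiency)
```

Because the fitness depends only on the outcome of a whole candidate, no gradient of the efficiency is needed. In this toy setting the search drifts toward the hottest allowed `t_hot` and the coldest allowed `t_cold`, which is the Carnot bound for the chosen range.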

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.08543/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1903.08543/full.md

## References

52 references — full list in the complete paper: https://tomesphere.com/paper/1903.08543/full.md

---
Source: https://tomesphere.com/paper/1903.08543