# What we learn from the learning rate

**Authors:** Rory A. Brittain, Nick S. Jones, Thomas E. Ouldridge

arXiv: 1702.06041 · 2017-07-04

## TL;DR

This paper investigates the learning rate in bipartite Markov chains. It clarifies the quantity's physical meaning, shows how it departs from the instantaneous mutual information in more complex steady-state systems, and questions whether it is a natural metric for sensor performance.

## Contribution

The study analyzes the behavior of the learning rate in simple and complex systems, highlighting its differences from mutual information and its limitations as a sensor performance measure.

## Key findings

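- In the simplest model system, the learning rate and the instantaneous mutual information are almost equivalent.
- In more complex steady-state systems, the two behave qualitatively differently: the learning rate reflects the rate at which the downstream system must update its information in response to changes in the upstream system.
- In one example, optimizing the learning rate over a region of the downstream system's parameter space yields an apparently sub-optimal sensor.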

## Abstract

The learning rate is an information-theoretical quantity for bipartite Markov chains describing two coupled subsystems. It is defined as the rate at which transitions in the downstream subsystem tend to increase the mutual information between the two subsystems, and is bounded by the dissipation arising from these transitions. Its physical interpretation, however, is unclear, although it has been used as a metric for the sensing performance of the downstream subsystem. In this paper, we explore the behaviour of the learning rate for a number of simple model systems, establishing when and how its behaviour is distinct from the instantaneous mutual information between subsystems. In the simplest case, the two are almost equivalent. In more complex steady-state systems, the mutual information and the learning rate behave qualitatively distinctly, with the learning rate clearly now reflecting the rate at which the downstream system must update its information in response to changes in the upstream system. It is not clear whether this quantity is the most natural measure for sensor performance, and, indeed, we provide an example in which optimising the learning rate over a region of parameter space of the downstream system yields an apparently sub-optimal sensor.
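
To make the definition in the abstract concrete, here is a minimal sketch that computes the steady-state mutual information and the learning rate for a four-state bipartite chain. It is an illustration, not the authors' code. The model (a two-state upstream variable `x` that flips independently, and a two-state downstream variable `y` that tracks it) and the rates `K`, `R`, `E` are hypothetical choices. The formula used is the direct reading of the abstract's wording, assumed here rather than quoted from the paper: the contribution of one subsystem's jumps to the time derivative of the mutual information.

```python
import math
from itertools import product

# Hypothetical rates (not taken from the paper):
# X flips at rate K regardless of Y; Y jumps to match X at rate R
# and jumps away from X at rate E.
K, R, E = 1.0, 5.0, 0.5
STATES = list(product((0, 1), repeat=2))  # joint states (x, y)


def rate(s, t):
    """Jump rate from s = (x, y) to t = (x2, y2).

    Bipartite structure: exactly one coordinate changes per jump.
    """
    (x, y), (x2, y2) = s, t
    if x != x2 and y == y2:
        return K
    if x == x2 and y != y2:
        return R if y2 == x else E
    return 0.0


def steady_state(n_iter=5000):
    """Stationary distribution via repeated steps of the uniformised chain."""
    lam = 2.0 * max(sum(rate(s, t) for t in STATES) for s in STATES)
    p = {s: 1.0 / len(STATES) for s in STATES}
    for _ in range(n_iter):
        new = {}
        for t in STATES:
            inflow = sum(p[s] * rate(s, t) for s in STATES)
            outflow = p[t] * sum(rate(t, u) for u in STATES)
            new[t] = p[t] + (inflow - outflow) / lam
        p = new
    return p


def info_density(p):
    """Pointwise mutual information ln[p(x,y) / (p(x) p(y))] per joint state."""
    px = {x: sum(p[(x, y)] for y in (0, 1)) for x in (0, 1)}
    py = {y: sum(p[(x, y)] for x in (0, 1)) for y in (0, 1)}
    return {(x, y): math.log(p[(x, y)] / (px[x] * py[y])) for (x, y) in STATES}


def mutual_information(p):
    """Mutual information between X and Y in nats."""
    i = info_density(p)
    return sum(p[s] * i[s] for s in STATES)


def learning_rate(p, coord):
    """Rate of change of mutual information due to jumps of one subsystem.

    coord = 0 selects jumps of X, coord = 1 selects jumps of Y.
    """
    i = info_density(p)
    total = 0.0
    for s in STATES:
        for t in STATES:
            if s[coord] != t[coord]:  # rate() is zero if both coordinates differ
                total += p[s] * rate(s, t) * (i[t] - i[s])
    return total


p = steady_state()
l_y = learning_rate(p, 1)  # downstream subsystem
l_x = learning_rate(p, 0)  # upstream subsystem
print(mutual_information(p), l_y, l_x)
```

In the steady state the mutual information is constant, so the two contributions cancel (`l_x + l_y` is zero up to numerical error). The upstream jumps erode the correlation while the downstream jumps restore it, which is consistent with the abstract's reading of the learning rate as the rate at which the downstream system must update its information.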

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.06041/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1702.06041/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1702.06041/full.md

---
Source: https://tomesphere.com/paper/1702.06041