On the Properties of the Softmax Function with Application in Game Theory and Reinforcement Learning

Bolin Gao; Lacra Pavel

arXiv:1704.00805 · math.OC · August 23, 2018 · 212 cites

TL;DR

This paper uses convex analysis and monotone operator theory to derive properties of the softmax function, showing that it is the gradient of the log-sum-exp function and applying the results to game-theoretic reinforcement learning.

Contribution

It establishes that softmax is the monotone gradient map of the log-sum-exp function and analyzes how the inverse temperature affects its Lipschitz and co-coercivity properties.
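The gradient relationship can be written out in one line. This page gives no formulas, so the notation below is an assumption: \lambda > 0 denotes the inverse temperature, \operatorname{lse}_\lambda a scaled log-sum-exp, and \sigma the softmax map. The paper's own scaling convention may differ.

```latex
\[
\operatorname{lse}_\lambda(z) = \frac{1}{\lambda}\log\sum_{j=1}^{n} e^{\lambda z_j},
\qquad
\frac{\partial \operatorname{lse}_\lambda}{\partial z_i}(z)
= \frac{1}{\lambda}\cdot\frac{\lambda\, e^{\lambda z_i}}{\sum_{j=1}^{n} e^{\lambda z_j}}
= \frac{e^{\lambda z_i}}{\sum_{j=1}^{n} e^{\lambda z_j}}
= \sigma_i(z).
\]
```

Since log-sum-exp is convex and the gradient of a differentiable convex function is a monotone map, monotonicity of softmax follows from this identity.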

Findings

01. Softmax is the monotone gradient map of the log-sum-exp function.

02. Inverse temperature influences Lipschitz and co-coercivity properties.

03. Application demonstrated in game-theoretic reinforcement learning.

Abstract

In this paper, we utilize results from convex analysis and monotone operator theory to derive additional properties of the softmax function that have not yet been covered in the existing literature. In particular, we show that the softmax function is the monotone gradient map of the log-sum-exp function. By exploiting this connection, we show that the inverse temperature parameter determines the Lipschitz and co-coercivity properties of the softmax function. We then demonstrate the usefulness of these properties through an application in game-theoretic reinforcement learning.
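The properties named in the abstract can be checked numerically. The sketch below is not taken from the paper: it assumes the scaled form lse(z) = (1/lam) * log(sum_j exp(lam * z_j)) with softmax_i(z) = exp(lam * z_i) / sum_j exp(lam * z_j), and it assumes the constants are a Lipschitz constant of lam and a co-coercivity constant of 1/lam. All function names (`softmax`, `log_sum_exp`, `check_properties`) are hypothetical helpers chosen for illustration.

```python
import math
import random


def softmax(z, lam=1.0):
    """Softmax with inverse temperature lam > 0 (shifted by max(z) for stability)."""
    m = max(z)
    exps = [math.exp(lam * (zi - m)) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]


def log_sum_exp(z, lam=1.0):
    """Scaled log-sum-exp: (1/lam) * log(sum_j exp(lam * z_j))."""
    m = max(z)
    return m + math.log(sum(math.exp(lam * (zi - m)) for zi in z)) / lam


def numerical_gradient(f, z, h=1e-6):
    """Central finite-difference approximation of the gradient of f at z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def check_properties(lam, dim=4, trials=200, seed=0):
    """Test the three assumed properties on random point pairs.

    Returns (worst gradient error, Lipschitz bound held, co-coercivity held).
    """
    rng = random.Random(seed)
    worst_grad_err = 0.0
    lipschitz_ok = True
    cocoercive_ok = True
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in range(dim)]
        y = [rng.uniform(-3, 3) for _ in range(dim)]
        sx, sy = softmax(x, lam), softmax(y, lam)

        # 1. Softmax should match the gradient of log-sum-exp.
        g = numerical_gradient(lambda z: log_sum_exp(z, lam), x)
        worst_grad_err = max(worst_grad_err,
                             max(abs(a - b) for a, b in zip(g, sx)))

        ds = [a - b for a, b in zip(sx, sy)]
        dz = [a - b for a, b in zip(x, y)]

        # 2. Assumed Lipschitz bound: ||s(x) - s(y)|| <= lam * ||x - y||.
        lipschitz_ok = lipschitz_ok and norm(ds) <= lam * norm(dz) + 1e-12

        # 3. Assumed co-coercivity:
        #    <s(x) - s(y), x - y> >= (1/lam) * ||s(x) - s(y)||^2.
        inner = sum(a * b for a, b in zip(ds, dz))
        cocoercive_ok = cocoercive_ok and inner >= norm(ds) ** 2 / lam - 1e-12
    return worst_grad_err, lipschitz_ok, cocoercive_ok


if __name__ == "__main__":
    for lam in (0.5, 1.0, 5.0):
        print(lam, check_properties(lam))
```

Random sampling can only fail to find a counterexample; it does not prove the bounds. The proofs are in the paper.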

Taxonomy

Topics: Mathematical and Theoretical Epidemiology and Ecology Models · Evolutionary Game Theory and Cooperation · Reinforcement Learning in Robotics

Methods: Softmax