On the Properties of the Softmax Function with Application in Game Theory and Reinforcement Learning
Bolin Gao, Lacra Pavel

TL;DR
This paper uses convex analysis and monotone operator theory to derive properties of the softmax function, showing that it is the gradient of the log-sum-exp function and applying the results to game-theoretic reinforcement learning.
Contribution
The paper establishes that softmax is the monotone gradient map of the log-sum-exp function and analyzes how the inverse temperature parameter determines its Lipschitz and co-coercivity properties.
Findings
Softmax is the monotone gradient map of the log-sum-exp function.
The inverse temperature parameter determines the Lipschitz and co-coercivity properties of softmax.
These properties are applied to game-theoretic reinforcement learning.
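The first two findings can be made concrete with a short derivation. The notation below (score vector z in R^n, inverse temperature λ > 0, softmax σ, and the name lse) is an assumption, since this summary gives no formulas; the constants follow from standard convex analysis and are not quoted from the paper.

```latex
% Assumed notation: z \in \mathbb{R}^n, inverse temperature \lambda > 0.
\[
  \operatorname{lse}(z) = \frac{1}{\lambda}\log\sum_{j=1}^{n} e^{\lambda z_j},
  \qquad
  \sigma_i(z) = \frac{e^{\lambda z_i}}{\sum_{j=1}^{n} e^{\lambda z_j}} .
\]
% Differentiating lse coordinate-wise recovers softmax:
\[
  \frac{\partial \operatorname{lse}}{\partial z_i}(z)
  = \frac{1}{\lambda}\cdot\frac{\lambda\, e^{\lambda z_i}}{\sum_{j=1}^{n} e^{\lambda z_j}}
  = \sigma_i(z),
  \qquad\text{so}\qquad \nabla \operatorname{lse} = \sigma .
\]
% The Hessian of lse is the Jacobian of softmax:
\[
  \nabla^2 \operatorname{lse}(z)
  = \lambda\left(\operatorname{diag}(\sigma(z)) - \sigma(z)\,\sigma(z)^{\top}\right),
  \qquad
  0 \preceq \nabla^2 \operatorname{lse}(z) \preceq \lambda I .
\]
% Positive semidefiniteness makes lse convex, hence \sigma monotone;
% the upper bound makes \sigma Lipschitz with constant \lambda:
\[
  \|\sigma(x) - \sigma(y)\|_2 \le \lambda\,\|x - y\|_2 ,
\]
% and, because \sigma is the gradient of a convex function, the
% Baillon-Haddad theorem gives co-coercivity with constant 1/\lambda:
\[
  \langle \sigma(x) - \sigma(y),\, x - y \rangle
  \ge \frac{1}{\lambda}\,\|\sigma(x) - \sigma(y)\|_2^2 .
\]
```

A larger λ (lower temperature) therefore permits a steeper softmax, while a smaller λ yields a smoother map with a stronger co-coercivity constant.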
Abstract
In this paper, we utilize results from convex analysis and monotone operator theory to derive additional properties of the softmax function that have not yet been covered in the existing literature. In particular, we show that the softmax function is the monotone gradient map of the log-sum-exp function. By exploiting this connection, we show that the inverse temperature parameter determines the Lipschitz and co-coercivity properties of the softmax function. We then demonstrate the usefulness of these properties through an application in game-theoretic reinforcement learning.
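The relationships described in the abstract can be checked numerically. The following is a minimal sketch and not the authors' code: the function names, the sampling range, and the tested constants (Lipschitz constant λ, co-coercivity constant 1/λ, which follow from standard convex analysis) are assumptions made for illustration.

```python
import math
import random


def softmax(z, lam=1.0):
    """Softmax with inverse temperature lam (assumed parameterization)."""
    m = max(z)  # shift by the max for numerical stability
    exps = [math.exp(lam * (zi - m)) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]


def log_sum_exp(z, lam=1.0):
    """(1/lam) * log(sum_j exp(lam * z_j)), computed stably."""
    m = max(z)
    return m + math.log(sum(math.exp(lam * (zi - m)) for zi in z)) / lam


def numerical_gradient(f, z, h=1e-6):
    """Central finite-difference approximation of the gradient of f at z."""
    grad = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        grad.append((f(zp) - f(zm)) / (2 * h))
    return grad


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def check_properties(lam, dim=4, trials=200, seed=0):
    """Empirically test three claims on random point pairs.

    Returns (max gradient error, Lipschitz bound held, co-coercivity held).
    """
    rng = random.Random(seed)
    tol = 1e-12  # slack for floating-point rounding
    max_grad_err = 0.0
    lipschitz_ok = True
    cocoercive_ok = True
    for _ in range(trials):
        x = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        y = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        sx, sy = softmax(x, lam), softmax(y, lam)

        # Claim 1: the gradient of log-sum-exp equals softmax.
        g = numerical_gradient(lambda z: log_sum_exp(z, lam), x)
        err = max(abs(a - b) for a, b in zip(g, sx))
        max_grad_err = max(max_grad_err, err)

        ds = [a - b for a, b in zip(sx, sy)]
        dz = [a - b for a, b in zip(x, y)]
        inner = sum(a * b for a, b in zip(ds, dz))

        # Claim 2: ||softmax(x) - softmax(y)|| <= lam * ||x - y||.
        lipschitz_ok = lipschitz_ok and norm(ds) <= lam * norm(dz) + tol
        # Claim 3: <softmax(x) - softmax(y), x - y> >= (1/lam) * ||ds||^2.
        cocoercive_ok = cocoercive_ok and inner >= norm(ds) ** 2 / lam - tol
    return max_grad_err, lipschitz_ok, cocoercive_ok
```

Calling `check_properties(lam)` for several values such as 0.5, 1.0, and 5.0 is one way to see that the same bounds scale with the inverse temperature. A random test of this kind can only fail to find a counterexample; it does not replace the proofs in the paper.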
Taxonomy
Topics: Mathematical and Theoretical Epidemiology and Ecology Models · Evolutionary Game Theory and Cooperation · Reinforcement Learning in Robotics
Methods: Softmax
