Beyond KL-divergence: Risk Aware Control Through Cross Entropy and Adversarial Entropy Regularization

Menno van Zutphen; Domagoj Herceg; Duarte J. Antunes

arXiv:2505.11068 · eess.SY · May 19, 2025

TL;DR

This paper develops a risk-aware control framework that uses cross-entropy and entropy regularization to model adversarial disturbances, leading to an efficient dynamic programming algorithm with connections to $\text{H}_\infty$ control.

Contribution

The paper introduces a regularization-based approach to robust control that balances fidelity to empirical data against adversarial uncertainty, going beyond KL-divergence-based formulations.

Findings

01. The minsoftmax algorithm efficiently computes robust control policies.

02. The framework generalizes $\text{H}_\infty$ control in Gaussian settings.

03. Numerical examples demonstrate improved robustness and flexibility.

Abstract

While the idea of robust dynamic programming (DP) is compelling for systems affected by uncertainty, addressing worst-case disturbances generally results in excessive conservatism. This paper introduces a method for constructing control policies robust to adversarial disturbance distributions that relate to a provided empirical distribution. The character of the adversary is shaped by a regularization term comprising a weighted sum of (i) the cross-entropy between the empirical and the adversarial distributions, and (ii) the entropy of the adversarial distribution itself. The regularization weights are interpreted as the likelihood factor and the temperature respectively. The proposed framework leads to an efficient DP-like algorithm -- referred to as the minsoftmax algorithm -- to obtain the optimal control policy, where the disturbances follow an analytical softmax distribution in…
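To make the role of the two weights concrete, here is a minimal sketch for a finite disturbance set. It rests on an assumed objective that is not taken from the paper (the abstract above is truncated, so the exact form may differ): the adversary picks a distribution q to maximize the expected cost plus `lam * sum(q * log(p_hat))` plus `tau` times the entropy of q. Under that assumption the maximizer is a softmax and the optimal value is a scaled log-sum-exp. The function names `soft_adversary` and `minsoftmax_backup`, the parameters `lam` and `tau`, and the tabular system description are all hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_adversary(costs, p_hat, lam, tau):
    """Adversarial distribution q and soft worst-case value (assumed objective).

    Assumed problem (illustrative, not the paper's exact formulation):
        max_q  sum_i q_i*c_i + lam*sum_i q_i*log(p_i) - tau*sum_i q_i*log(q_i)
    Its maximizer is q_i proportional to exp((c_i + lam*log p_i) / tau),
    and the optimal value is tau * log(sum_i exp((c_i + lam*log p_i) / tau)).
    Requires tau > 0 and strictly positive p_hat.
    """
    costs = np.asarray(costs, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    logits = (costs + lam * np.log(p_hat)) / tau
    m = logits.max()                      # shift for numerical stability
    w = np.exp(logits - m)
    q = w / w.sum()                       # softmax over disturbances
    value = tau * (m + np.log(w.sum()))   # scaled log-sum-exp
    return q, value


def minsoftmax_backup(stage_cost, next_state, V_next, p_hat, lam, tau):
    """One backward DP step on a finite system: minimize over controls the
    soft worst-case value over disturbances.

    stage_cost[x, u, w] is the stage cost, next_state[x, u, w] the integer
    index of the successor state, V_next the cost-to-go at the next stage.
    """
    n_x, n_u, _ = stage_cost.shape
    V = np.empty(n_x)
    policy = np.empty(n_x, dtype=int)
    for x in range(n_x):
        vals = np.empty(n_u)
        for u in range(n_u):
            # Cost of each disturbance outcome: stage cost plus cost-to-go.
            c = stage_cost[x, u] + V_next[next_state[x, u]]
            _, vals[u] = soft_adversary(c, p_hat, lam, tau)
        policy[x] = int(np.argmin(vals))
        V[x] = vals[policy[x]]
    return V, policy
```

In this sketch, setting `lam == tau` recovers the familiar KL-regularized case, where q is proportional to `p_hat * exp(costs / tau)`. Shrinking both weights toward zero pushes the value toward the plain worst case, which is the conservatism the abstract aims to avoid.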

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning

Methods: Softmax