Identity Concealment Games: How I Learned to Stop Revealing and Love the Coincidences

Mustafa O. Karabag; Melkior Ornik; Ufuk Topcu

arXiv:2105.05377·cs.GT·March 5, 2024·1 cites

TL;DR

This paper introduces identity concealment games, a new framework for modeling strategic concealment of identity in adversarial stochastic environments, and provides algorithms for optimal policy synthesis and learning.

Contribution

The paper defines identity concealment games, proves that equilibrium policy pairs exist, and develops a learning algorithm for the hostile player with sample complexity bounds.

Findings

01 Existence of equilibrium policy pairs in identity concealment games.

02 Optimality equations for policy synthesis.

03 An algorithm with sample complexity bounds for learning near-optimal policies.

Abstract

In an adversarial environment, a hostile player performing a task may behave like a non-hostile one in order not to reveal its identity to an opponent. To model such a scenario, we define identity concealment games: zero-sum stochastic reachability games with a zero-sum objective of identity concealment. To measure the identity concealment of the player, we introduce the notion of an average player. The average player's policy represents the expected behavior of a non-hostile player. We show that there exists an equilibrium policy pair for every identity concealment game and give the optimality equations to synthesize an equilibrium policy pair. If the player's opponent follows a non-equilibrium policy, the player can hide its identity better. For this reason, we study how the hostile player may learn the opponent's policy. Since learning via exploration policies would quickly reveal…

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Optimization and Search Problems