StratFormer: Adaptive Opponent Modeling and Exploitation in Imperfect-Information Games

Andy Caen; Mark H.M. Winands; Dennis J.N.J. Soemers

arXiv:2604.25796 · cs.AI · April 29, 2026

TL;DR

StratFormer is a transformer-based meta-agent that learns to model and exploit opponents in imperfect-information games through a two-phase curriculum, achieving measurable exploitation gains on Leduc Hold'em while staying close to equilibrium safety.

Contribution

The paper introduces a novel transformer architecture with dual-turn tokens and a two-phase training curriculum for opponent modeling and exploitation in imperfect-information games.

Findings

01. Achieves an average exploitation gain of +0.106 BB per hand on Leduc Hold'em.

02. Reaches peak gains of +0.821 BB against highly exploitable opponents.

03. Maintains near-equilibrium safety during exploitation.

Abstract

We present StratFormer, a transformer-based meta-agent that learns to simultaneously model and exploit opponents in imperfect-information games through a two-phase curriculum. The first phase trains an opponent modeling head to identify behavioral patterns from action histories while the agent plays a game-theoretic optimal (GTO) policy. The second phase progressively shifts the policy toward best-response (BR) exploitation, guided by a per-opponent regularization schedule tied to exploitability. Our architecture introduces dual-turn tokens -- feature vectors constructed at both agent and opponent decision points -- coupled with bucket-rate features that encode opponent tendencies across five strategic contexts. On Leduc Hold'em, a small poker variant with six cards and two betting rounds, we test against six opponent archetypes at two strength levels each, with exploitability ranging…
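The second-phase idea of shifting from GTO toward BR play under a per-opponent schedule can be illustrated with a small sketch. This is not the paper's method: the abstract does not specify the schedule or how the regularizer enters training, so the sketch assumes a simple convex mix of the two policies, with the weight on the GTO policy shrinking linearly in both training progress and the opponent's normalized exploitability. The function names `regularization_weight` and `mixed_policy`, and the parameters `max_exploitability` and `progress`, are hypothetical.

```python
import numpy as np


def regularization_weight(exploitability, max_exploitability, progress):
    """Hypothetical schedule: weight on the GTO policy, in [0, 1].

    Assumption (not from the paper): the weight decreases linearly with
    training progress, scaled by how exploitable the opponent is.
    """
    # Normalize the opponent's exploitability to [0, 1].
    norm = np.clip(exploitability / max_exploitability, 0.0, 1.0)
    # Clamp training progress to [0, 1].
    p = np.clip(progress, 0.0, 1.0)
    return float(1.0 - p * norm)


def mixed_policy(gto_probs, br_probs, lam):
    """Convex combination of GTO and BR action distributions.

    lam = 1 plays pure GTO; lam = 0 plays pure best response.
    """
    gto = np.asarray(gto_probs, dtype=float)
    br = np.asarray(br_probs, dtype=float)
    mix = lam * gto + (1.0 - lam) * br
    # Renormalize to guard against rounding drift.
    return mix / mix.sum()


# Example: a highly exploitable opponent late in training.
lam = regularization_weight(exploitability=0.8, max_exploitability=1.0, progress=1.0)
policy = mixed_policy([0.5, 0.5], [1.0, 0.0], lam)
```

Under these assumptions, an unexploitable opponent keeps the agent on the GTO policy for the whole of training, while a highly exploitable one lets it move most of the way toward the best response. That is one plausible reading of "tied to exploitability", and it would be consistent with the safety claim in the findings.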
