Dual-Gated Epistemic Time-Dilation: Autonomous Compute Modulation in Asynchronous MARL
Igor Jankowski

TL;DR
This paper introduces ETD-MAPPO, a novel asynchronous MARL method that autonomously modulates agent compute frequency based on uncertainty, significantly reducing computational costs while maintaining performance.
Contribution
It proposes a dual-gated epistemic trigger for autonomous compute modulation in MARL, enabling asynchronous execution and improved efficiency over traditional synchronous models.
Findings
Achieved over 60% relative improvement in acquisition over baseline models.
Prevented premature policy collapse in complex environments like Google Research Football.
Reduced computational overhead by 73.6% during off-ball execution without performance loss.
Abstract
While Multi-Agent Reinforcement Learning (MARL) algorithms achieve unprecedented successes across complex continuous domains, their standard deployment strictly adheres to a synchronous operational paradigm. Under this paradigm, every agent is forced to execute a deep neural network inference at every micro-frame, regardless of immediate necessity. This dense throughput is a fundamental barrier to physical deployment on edge devices, where thermal and metabolic budgets are highly constrained. We propose Epistemic Time-Dilation MAPPO (ETD-MAPPO), augmented with a Dual-Gated Epistemic Trigger. Instead of depending on rigid frame-skipping (macro-actions), agents autonomously modulate their execution frequency by interpreting aleatoric uncertainty (via the Shannon entropy of their policy) and epistemic uncertainty (via state-value divergence in a Twin-Critic architecture). To format…
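The two gates named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the threshold values, and the choice to combine the gates with a logical OR are all assumptions made for illustration, since the abstract is truncated before it specifies them. The sketch assumes that after each forward pass an agent has its action probabilities and two critic value estimates, and uses them to decide whether the next frame warrants a fresh inference or whether the previous action can be held.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_run_inference(probs, v1, v2,
                         entropy_threshold=0.5,    # hypothetical value
                         value_gap_threshold=0.1):  # hypothetical value
    """Return True if either uncertainty gate is open.

    Gate 1 (aleatoric): policy entropy exceeds a threshold.
    Gate 2 (epistemic): the two critics disagree on the state value.
    Combining the gates with OR is an assumption, not the paper's stated rule.
    """
    entropy_gate = shannon_entropy(probs) > entropy_threshold
    divergence_gate = abs(v1 - v2) > value_gap_threshold
    return entropy_gate or divergence_gate


# A confident policy with agreeing critics: the agent may hold its action.
confident = [0.97, 0.01, 0.01, 0.01]
print(should_run_inference(confident, v1=1.00, v2=1.02))

# A uniform policy is maximally uncertain, so a fresh inference is requested.
uniform = [0.25, 0.25, 0.25, 0.25]
print(should_run_inference(uniform, v1=1.00, v2=1.02))
```

Under these assumed thresholds, an agent far from the ball with a sharply peaked policy and agreeing critics would skip inference, which is the kind of off-ball saving the Findings section reports.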
Taxonomy
Topics: Reinforcement Learning in Robotics · Green IT and Sustainability · Embodied and Extended Cognition
