TRAM: Test-Time Risk Adaptation with Mixture of Agents

Mohamad Fares El Hajj Chehade; Amrit Singh Bedi; Amy Zhang; Hao Zhu

arXiv:2408.08812 · cs.LG · May 21, 2026

TL;DR

TRAM lets reinforcement learning agents adapt to new safety constraints at deployment time by combining pre-trained policies according to risk-adjusted scores, without additional training.

Contribution

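The paper introduces TRAM, a method for zero-update deployment-time risk adaptation that selects actions from a fixed library of source policies using risk-adjusted scores. It supports risk types specified at test time, including spatial barrier exposure, divergence from a reference behavior, and local volatility.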

Findings

01 TRAM reduces deployment risk across multiple environments.

02 TRAM maintains reward performance while adapting to new safety constraints.

03 TRAM does not require parameter updates during deployment.

Abstract

Deployed reinforcement learning agents often face safety requirements that are specified only after training, such as new hazard maps, revised risk thresholds, or behavioral alignment constraints. We study zero-update deployment-time adaptation, where a fixed library of risk-neutral source policies is reused under a newly specified reward-risk tradeoff. We propose TRAM (Test-Time Risk Adaptation via Mixture of Agents), a source-scored composition rule that evaluates each source policy under the target reward and an occupancy-based deployment risk, then selects actions using risk-adjusted source scores. Unlike training-time risk-sensitive methods tied to a fixed surrogate such as return variance, TRAM supports spatial barrier exposure, divergence from a reference behavior, and local volatility risks specified at test time. We explicitly characterize TRAM as a surrogate method: it does…
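The source-scored selection described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `select_action`, the linear score `reward - risk_weight * risk`, and the toy policies and value functions are all assumptions made for illustration. The paper's occupancy-based risk and exact scoring rule are not given in the text above, so they are stood in for here by plain callables.

```python
from typing import Callable, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable
Policy = Callable[[State], Action]
ValueFn = Callable[[State, Action], float]


def select_action(
    state: State,
    policies: Sequence[Policy],
    reward_values: Sequence[ValueFn],
    risk_values: Sequence[ValueFn],
    risk_weight: float,
) -> Tuple[Action, int]:
    """Pick the action proposed by the best-scoring source policy.

    Hypothetical scoring rule (an assumption, not taken from the paper):
        score_i = reward_value_i(s, a_i) - risk_weight * risk_value_i(s, a_i)
    where a_i is the action source policy i proposes in state s.
    No policy parameters are updated; the sources are only queried.
    """
    if not policies:
        raise ValueError("at least one source policy is required")
    if not (len(policies) == len(reward_values) == len(risk_values)):
        raise ValueError("policies and value functions must have equal length")

    candidates = []
    for index, (policy, q_reward, q_risk) in enumerate(
        zip(policies, reward_values, risk_values)
    ):
        action = policy(state)
        score = q_reward(state, action) - risk_weight * q_risk(state, action)
        candidates.append((score, index, action))

    # max() returns the first maximal element, so ties go to the earlier source.
    _, best_index, best_action = max(candidates, key=lambda c: c[0])
    return best_action, best_index


# Toy library of two fixed source policies (illustrative values only).
policies = [lambda s: "shortcut", lambda s: "detour"]
reward_values = [lambda s, a: 10.0, lambda s, a: 8.0]  # target-reward estimates
risk_values = [lambda s, a: 5.0, lambda s, a: 1.0]     # deployment-risk estimates

risk_neutral = select_action("s0", policies, reward_values, risk_values, 0.0)
risk_averse = select_action("s0", policies, reward_values, risk_values, 1.0)
```

In this toy setup, a risk weight of 0 selects the higher-reward "shortcut" source, while a risk weight of 1 selects the "detour" source because its risk-adjusted score (8 - 1 = 7) exceeds that of the shortcut (10 - 5 = 5). Changing the tradeoff at deployment only changes `risk_weight` and the risk estimates; the source policies themselves stay fixed.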

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Smart Grid Security and Resilience · Reinforcement Learning in Robotics