Learning to Be Cautious

Montaser Mohammedalamen; Dustin Morrill; Alexander Sieusahai; Yash Satsangi; Michael Bowling

arXiv:2110.15907 · cs.AI · October 14, 2025 · 1 citation

TL;DR

This paper introduces a reinforcement learning algorithm that enables agents to learn cautious behaviors in novel situations by modeling reward uncertainty with neural network ensembles, eliminating the need for task-specific safety tuning.

Contribution

It presents a novel approach that learns cautious policies through reward function uncertainty without relying on explicit safety information or task-specific tuning.
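
As a rough illustration of how ensemble disagreement can translate into caution, the sketch below scores each action by the average of its k lowest ensemble reward predictions and picks the best-scoring action. This is a minimal sketch under stated assumptions, not the paper's algorithm: the action names, the prediction values, and the helpers `worst_k_average` and `choose_action` are all hypothetical, and the worst-k rule is just one simple way to be pessimistic about an uncertain reward.

```python
from statistics import mean

# Hypothetical reward predictions from a 4-member ensemble for two actions.
# These numbers are invented for illustration; they are not from the paper.
ENSEMBLE_PREDICTIONS = {
    "familiar_path": [1.0, 1.1, 0.9, 1.0],    # members agree: familiar situation
    "novel_shortcut": [3.0, 2.5, -4.0, 3.5],  # members disagree: novel situation
}


def worst_k_average(predictions, k):
    """Average of the k lowest predictions (k=1 is the ensemble minimum)."""
    if not 1 <= k <= len(predictions):
        raise ValueError("k must be between 1 and the ensemble size")
    return mean(sorted(predictions)[:k])


def choose_action(predictions_by_action, k):
    """Pick the action whose worst-k average predicted reward is highest."""
    return max(
        predictions_by_action,
        key=lambda action: worst_k_average(predictions_by_action[action], k),
    )


# k equal to the ensemble size reduces to the plain ensemble mean.
risk_neutral = choose_action(ENSEMBLE_PREDICTIONS, k=4)
# k = 1 scores each action by its most pessimistic ensemble member.
cautious = choose_action(ENSEMBLE_PREDICTIONS, k=1)

print(risk_neutral, cautious)
```

With these made-up numbers, the mean-based choice favors the shortcut because most members predict a high reward, while the pessimistic choice avoids it because one member predicts a large loss. Smaller k makes the selection more conservative.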

Findings

01. Agents learn cautious behaviors in complex tasks.

02. The approach outperforms safety-tuned baselines.

03. Caution emerges without explicit safety constraints.

Abstract

A key challenge in the field of reinforcement learning is to develop agents that behave cautiously in novel situations. It is generally impossible to anticipate all situations that an autonomous system may face or what behavior would best avoid bad outcomes. An agent that can learn to be cautious would overcome this challenge by discovering for itself when and how to behave cautiously. In contrast, current approaches typically embed task-specific safety information or explicit cautious behaviors into the system, which is error-prone and imposes extra burdens on practitioners. In this paper, we present both a sequence of tasks in which cautious behavior becomes increasingly non-obvious and an algorithm that demonstrates it is possible for a system to learn to be cautious. The essential features of our algorithm are that it characterizes reward function uncertainty without…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)