Rejecting Hallucinated State Targets during Planning

Mingde Zhao; Tristan Sylvain; Romain Laroche; Doina Precup; Yoshua Bengio

arXiv:2410.07096 · cs.AI · August 12, 2025

TL;DR

This paper introduces a method to identify and reject infeasible, hallucinated targets in planning agents, reducing delusional behaviors and improving performance without altering the original agent or its generator.

Contribution

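It proposes a target feasibility evaluator that filters infeasible targets during planning. The evaluator is trained with a combination of an off-policy compatible learning rule, a distributional architecture, and data augmentation based on hindsight relabeling.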

Findings

1. Significant reduction in delusional behaviors
2. Performance improvements across various agents
3. Effective identification of infeasible targets

Abstract

In the planning processes of computational decision-making agents, generative or predictive models are often used as "generators" to propose "targets" representing sets of expected or desirable states. Unfortunately, learned models inevitably hallucinate infeasible targets that can cause delusional behaviors and safety concerns. We first investigate the kinds of infeasible targets that generators can hallucinate. Then, we devise a strategy to identify and reject infeasible targets by learning a target feasibility evaluator. To ensure that the evaluator is robust and non-delusional, we adopt a design that combines an off-policy compatible learning rule, a distributional architecture, and data augmentation based on hindsight relabeling. Attached to a planning agent, the designed evaluator learns by observing the agent's interactions with the environment and the targets produced by its…
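The rejection idea in the abstract can be illustrated with a toy Python sketch. It is not the authors' implementation: the function names (`hindsight_relabel`, `make_tabular_evaluator`, `reject_infeasible`), the 0.5 threshold, and the lookup-table evaluator are all assumptions made for illustration, and the table stands in for the learned distributional evaluator the abstract describes. The sketch only shows how hindsight relabeling turns an observed trajectory into evidence of reachable targets, and how a feasibility score can then be used to discard proposed targets.

```python
from typing import Callable, Hashable, List, Sequence, Set, Tuple

State = Hashable


def hindsight_relabel(trajectory: Sequence[State]) -> List[Tuple[State, State, int]]:
    """Build (state, target, steps) examples from one observed trajectory.

    Every later state was demonstrably reached from every earlier state,
    so each such pair is evidence that the target is feasible.
    """
    examples = []
    for i, state in enumerate(trajectory):
        for j in range(i + 1, len(trajectory)):
            examples.append((state, trajectory[j], j - i))
    return examples


def make_tabular_evaluator(
    examples: Sequence[Tuple[State, State, int]],
) -> Callable[[State, State], float]:
    """Hypothetical stand-in for a learned evaluator: a lookup table.

    It scores 1.0 for (state, target) pairs seen in the relabeled data and
    0.0 otherwise. Treating every unseen pair as infeasible is a crude
    simplification made only for this illustration.
    """
    reached: Set[Tuple[State, State]] = {(s, t) for s, t, _ in examples}

    def feasibility(state: State, target: State) -> float:
        return 1.0 if (state, target) in reached else 0.0

    return feasibility


def reject_infeasible(
    state: State,
    candidates: Sequence[State],
    feasibility: Callable[[State, State], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[State]:
    """Keep only the candidate targets whose feasibility meets the threshold."""
    return [t for t in candidates if feasibility(state, t) >= threshold]


# Toy usage: "ghost" plays the role of a hallucinated target never reached.
trajectory = ["s0", "s1", "s2"]
examples = hindsight_relabel(trajectory)
evaluator = make_tabular_evaluator(examples)
kept = reject_infeasible("s0", ["s2", "ghost"], evaluator)
print(kept)
```

In this sketch the generator and the planning agent are left untouched: the filter only sits between the proposed targets and the planner, which mirrors the claim that the evaluator can be attached without altering the original agent.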

Code & Models

Repositories

mila-iqia/delusions (PyTorch, official)

Taxonomy

Topics: Deception detection and forensic psychology · Criminal Justice and Corrections Analysis · Psychopathy, Forensic Psychiatry, Sexual Offending