HiVAE: Hierarchical Latent Variables for Scalable Theory of Mind

Nigel Doering; Rahath Malladi; Arshia Sangwan; David Danks; Tauhidur Rahman

arXiv:2602.16826 · cs.LG · February 20, 2026

TL;DR

HiVAE is a three-level hierarchical variational autoencoder that scales theory-of-mind reasoning from small gridworlds to a 3,185-node campus navigation task. It reports substantial performance improvements, but its learned latent representations are not explicitly grounded in actual mental states.

Contribution

The paper presents HiVAE, a three-level hierarchical VAE inspired by the belief-desire-intention structure of human cognition, which scales theory-of-mind reasoning to realistic spatiotemporal domains.

Findings

01. Achieves substantial performance improvements on a 3,185-node campus navigation task.

02. The hierarchical structure improves prediction.

03. Learned latent representations lack explicit grounding to actual mental states.

Abstract

Theory of mind (ToM) enables AI systems to infer agents' hidden goals and mental states, but existing approaches focus mainly on small, human-understandable gridworld spaces. We introduce HiVAE, a hierarchical variational architecture that scales ToM reasoning to realistic spatiotemporal domains. Inspired by the belief-desire-intention structure of human cognition, our three-level VAE hierarchy achieves substantial performance improvements on a 3,185-node campus navigation task. However, we identify a critical limitation: while our hierarchical structure improves prediction, learned latent representations lack explicit grounding to actual mental states. We propose self-supervised alignment strategies and present this work to solicit community feedback on grounding approaches.
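The abstract does not state the training objective, so the following is only a minimal sketch of how a three-level VAE hierarchy is commonly trained: a reconstruction term plus one KL term per latent level. It assumes diagonal Gaussian posteriors and priors. The function names (`gaussian_kl`, `hierarchical_kl`, `negative_elbo`), the `beta` weight, and the idea of one latent per level are illustrative assumptions, not details taken from the paper.

```python
import math

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians.

    Each argument is a list of per-dimension means or log-variances.
    """
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total

def hierarchical_kl(posteriors, priors):
    """Sum the KL terms over all latent levels.

    posteriors and priors are lists of (mu, logvar) pairs, one pair per level.
    Assumption: three levels, as in the abstract; how each level is
    parameterized in HiVAE is not specified there.
    """
    return sum(gaussian_kl(*q, *p) for q, p in zip(posteriors, priors))

def negative_elbo(recon_nll, posteriors, priors, beta=1.0):
    """Hypothetical loss: reconstruction NLL plus beta-weighted KL over levels."""
    return recon_nll + beta * hierarchical_kl(posteriors, priors)

# Usage: three one-dimensional levels, each posterior N(1, 1) against a prior N(0, 1).
q = ([1.0], [0.0])  # (mean, log-variance) of the approximate posterior
p = ([0.0], [0.0])  # (mean, log-variance) of the prior
loss = negative_elbo(2.0, [q, q, q], [p, p, p])  # 2.0 + 3 * 0.5 = 3.5
```

In a learned model, the prior at each lower level would typically be conditioned on the level above; here the priors are fixed numbers purely to keep the arithmetic checkable.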

Taxonomy

Topics: Child and Animal Learning Development · Embodied and Extended Cognition · Multimodal Machine Learning Applications