Context as Prior: Bayesian-Inspired Intent Inference for Non-Speaking Agents with a Household Cat Testbed

Wenqian Zhang; Zehao Wang

arXiv:2604.27445·cs.CV·May 1, 2026

TL;DR

This paper introduces CatSignal, a Bayesian-inspired framework for inferring the intent of non-speaking agents such as household cats. It treats spatial context as a prior-like constraint and behavioral observations as evidence, with the aim of improving accuracy and reducing shortcut errors.

Contribution

It presents a context-gated Product-of-Experts model for intent inference and reports better performance on a household cat dataset than feature-concatenation and late-fusion baselines.

Findings

01 Achieved 77.72% accuracy in intent inference on a domestic cat dataset.

02 Reduced shortcut failures caused by context ambiguity.

03 Outperformed feature concatenation and late-fusion baselines.

Abstract

Many agents in real-world environments cannot reliably communicate their goals through language, including household pets, pre-verbal infants, and other non-speaking embodied agents. In such settings, intent must be inferred from incomplete behavioral observations in context-rich environments. This creates a core ambiguity: observable behavior is often noisy or underspecified, while context provides strong prior information but can also induce brittle shortcut predictions if used naively. We present CatSignal, a Bayesian-inspired probabilistic framework for multimodal intent inference that models spatial context as a prior-like constraint and behavioral observations as evidence. Rather than treating context as an ordinary input feature, our method uses a context-gated Product-of-Experts formulation to compute posterior-like intent distributions from context, pose dynamics, and…
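
To make the idea of a context-gated Product-of-Experts concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify how the gate is computed or parameterized, so the scalar `gate` exponent on the context prior, the function names `gated_poe` and `normalize_log`, the intent labels, and all probabilities below are assumptions chosen for illustration only. The sketch multiplies a tempered context prior with one behavioral expert and renormalizes, which is one common way to form a posterior-like distribution.

```python
import math


def normalize_log(log_scores):
    """Convert unnormalized log scores into probabilities (stable softmax)."""
    peak = max(log_scores)
    exps = [math.exp(s - peak) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_poe(context_prior, evidence_experts, gate):
    """Hypothetical context-gated Product-of-Experts (illustrative only).

    context_prior:    probabilities over intents given spatial context.
    evidence_experts: list of probability vectors, one per behavioral cue.
    gate:             assumed scalar in [0, 1]; 0 ignores context,
                      1 applies the context prior at full strength.
    """
    eps = 1e-12  # avoids log(0) for zero-probability entries
    # Tempered prior: gate * log p(intent | context).
    log_scores = [gate * math.log(p + eps) for p in context_prior]
    # Each expert contributes a multiplicative factor (additive in log space).
    for expert in evidence_experts:
        for i, p in enumerate(expert):
            log_scores[i] += math.log(p + eps)
    return normalize_log(log_scores)


# Hypothetical example: the cat is near the food bowl, but its pose suggests play.
INTENTS = ["hungry", "play", "rest"]
context = [0.8, 0.1, 0.1]
pose = [0.2, 0.7, 0.1]

full = gated_poe(context, [pose], gate=1.0)   # context dominates
soft = gated_poe(context, [pose], gate=0.3)   # behavioral evidence dominates

print(INTENTS[full.index(max(full))])
print(INTENTS[soft.index(max(soft))])
```

With the gate fully open, the strong context prior wins and the sketch predicts "hungry", which is the kind of shortcut the abstract warns about. With the gate lowered to 0.3, the pose evidence outweighs the prior and the prediction becomes "play". A learned, input-dependent gate would replace the fixed scalar in a real system.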
