Factored Reasoning with Inner Speech and Persistent Memory for Evidence-Grounded Human-Robot Interaction

Valerio Belcamino; Mariya Kilina; Alessandro Carfì; Valeria Seidita; Fulvio Mastrogiovanni; Antonio Chella

arXiv:2602.00675·cs.RO·February 3, 2026

Open Access

TL;DR

This paper introduces JANUS, a cognitive architecture for human-robot interaction that employs factored reasoning, inner speech, and persistent memory to maintain context, verify decisions, and ground responses in external evidence.

Contribution

The paper presents JANUS, a novel cognitive architecture that models interaction as a factored POMDP with explicit modules and policies, enhancing evidence-grounded, verifiable robot assistance.

Findings

01 High agreement with curated references in the dietary assistance domain

02 Effective management of persistent context and evidence grounding

03 Low latency in module-level unit tests

Abstract

Dialogue-based human-robot interaction requires robot cognitive assistants to maintain persistent user context, recover from underspecified requests, and ground responses in external evidence, while keeping intermediate decisions verifiable. In this paper, we introduce JANUS, a cognitive architecture for assistive robots that models interaction as a partially observable Markov decision process and realizes control as a factored controller with typed interfaces. To this end, JANUS (i) decomposes the overall behavior into specialized modules for scope detection, intent recognition, memory, inner speech, query generation, and outer speech, and (ii) exposes explicit policies for information sufficiency, execution readiness, and tool grounding. A dedicated memory agent maintains a bounded recent-history buffer, a compact core memory, and an archival store with semantic retrieval,…
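The three memory tiers named in the abstract (a bounded recent-history buffer, a compact core memory, and an archival store with retrieval) can be illustrated with a minimal sketch. This is not the JANUS implementation: the class name `TieredMemory`, its methods, the eviction rule, and the token-overlap scoring are all assumptions made for illustration, and the token overlap is only a stand-in for whatever semantic retrieval the paper actually uses.

```python
from collections import deque
from dataclasses import dataclass, field


def _tokens(text):
    """Lowercased whitespace tokens; a crude stand-in for an embedding."""
    return set(text.lower().split())


@dataclass
class TieredMemory:
    """Hypothetical three-tier memory; names are illustrative, not from the paper."""
    max_recent: int = 4
    recent: deque = field(init=False)            # bounded recent-history buffer
    core: dict = field(default_factory=dict)     # compact core memory (key facts)
    archive: list = field(default_factory=list)  # archival store

    def __post_init__(self):
        self.recent = deque(maxlen=self.max_recent)

    def add_turn(self, utterance):
        # Assumed eviction rule: when the buffer is full, the oldest turn
        # is copied to the archive before the deque drops it.
        if len(self.recent) == self.recent.maxlen:
            self.archive.append(self.recent[0])
        self.recent.append(utterance)

    def remember_fact(self, key, value):
        self.core[key] = value

    def retrieve(self, query, k=1):
        # Rank archived turns by Jaccard token overlap with the query.
        q = _tokens(query)

        def score(doc):
            d = _tokens(doc)
            union = q | d
            return len(q & d) / len(union) if union else 0.0

        return sorted(self.archive, key=score, reverse=True)[:k]


if __name__ == "__main__":
    mem = TieredMemory(max_recent=2)
    mem.remember_fact("allergy", "peanuts")
    for turn in ["I am allergic to peanuts",
                 "What can I eat for lunch",
                 "Suggest a pasta recipe"]:
        mem.add_turn(turn)
    print(list(mem.recent))
    print(mem.retrieve("allergic peanuts"))
```

In this sketch the buffer bound keeps the working context small, while older turns stay retrievable from the archive; a real system would likely replace the token overlap with embedding similarity.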

Taxonomy

Topics: Social Robot Interaction and HRI · Speech and dialogue systems · Multimodal Machine Learning Applications