Rational Inverse Reasoning
Ben Zandonati, Tomás Lozano-Pérez, Leslie Pack Kaelbling

TL;DR
Rational Inverse Reasoning (RIR) is a hierarchical Bayesian framework that enables robots to infer structured, executable programs from minimal demonstrations, significantly improving one-shot and few-shot generalization in manipulation tasks.
Contribution
This work introduces RIR, a novel hierarchical Bayesian model that combines vision-language reasoning with program induction for improved robot generalization from limited data.
Findings
RIR outperforms state-of-the-art baselines in manipulation tasks.
RIR successfully generalizes from a single demonstration to novel object configurations.
RIR infers structured task programs that capture high-level goals and constraints.
Abstract
Humans can observe a single, imperfect demonstration and immediately generalize to very different problem settings. Robots, in contrast, often require hundreds of examples and still struggle to generalize beyond the training conditions. We argue that this limitation arises from the inability to recover the latent explanations that underpin intelligent behavior, and that these explanations can take the form of structured programs consisting of high-level goals, sub-task decomposition, and execution constraints. In this work, we introduce Rational Inverse Reasoning (RIR), a framework for inferring these latent programs through a hierarchical generative model of behavior. RIR frames few-shot imitation as Bayesian program induction: a vision-language model iteratively proposes structured symbolic task hypotheses, while a planner-in-the-loop inference scheme scores each by the likelihood of…
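To make the "Bayesian program induction" framing concrete, here is a minimal toy sketch of posterior scoring over a fixed set of candidate hypotheses. It is not the authors' implementation. Everything in it is an assumption made for illustration: the `Hypothesis` class, the 2D goal positions standing in for task programs, the Gaussian likelihood standing in for a planner-based score, and the hard-coded candidate list standing in for proposals from a vision-language model.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical stand-in for a candidate task program."""
    name: str
    log_prior: float  # log p(program); assumed to come from the proposer
    goal: tuple       # toy "program": a target (x, y) position


def log_likelihood(hyp, demo, noise=0.1):
    """Toy log p(demo | program), up to an additive constant.

    Assumption: a demonstration is explained well if its final state
    lies close to the hypothesis goal (isotropic Gaussian noise).
    A real system would score the demonstration against planner output.
    """
    x, y = demo[-1]
    gx, gy = hyp.goal
    return -((x - gx) ** 2 + (y - gy) ** 2) / (2 * noise ** 2)


def posterior(hyps, demo):
    """Normalize prior * likelihood over the candidate set."""
    scores = [h.log_prior + log_likelihood(h, demo) for h in hyps]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return {h.name: w / total for h, w in zip(hyps, weights)}


# Two hypothetical candidates with equal prior mass.
candidates = [
    Hypothesis("place_right", math.log(0.5), (1.0, 0.0)),
    Hypothesis("place_up", math.log(0.5), (0.0, 1.0)),
]

# A single, slightly imperfect demonstration ending near (1, 0).
demo = [(0.0, 0.0), (0.5, 0.0), (0.95, 0.05)]

post = posterior(candidates, demo)
best = max(post, key=post.get)
print(best, post)
```

With one demonstration that ends near (1, 0), nearly all posterior mass falls on the `place_right` candidate. The same normalization applies unchanged if the candidate set is regenerated or extended between scoring rounds.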
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications
