OOWM: Structuring Embodied Reasoning and Planning via Object-Oriented Programmatic World Modeling

Hongyu Chen; Liang Lin; Guangrun Wang

arXiv:2604.09580·cs.AI·April 14, 2026

TL;DR

OOWM is an object-oriented, symbolic world modeling framework for embodied reasoning in robots. It uses UML diagrams and a novel training pipeline to improve planning and execution success.

Contribution

This work presents a new structured world modeling approach combining UML formalism with a training pipeline, enhancing embodied reasoning in robotic tasks.

Findings

01. OOWM outperforms unstructured baselines in planning coherence.

02. OOWM achieves higher execution success rates.

03. OOWM demonstrates improved structural fidelity in world modeling.

Abstract

Standard Chain-of-Thought (CoT) prompting empowers Large Language Models (LLMs) with reasoning capabilities, yet its reliance on linear natural language is inherently insufficient for effective world modeling in embodied tasks. While text offers flexibility, it fails to explicitly represent the state-space, object hierarchies, and causal dependencies required for robust robotic planning. To address these limitations, we propose Object-Oriented World Modeling (OOWM), a novel framework that structures embodied reasoning through the lens of software engineering formalisms. We redefine the world model not as a latent vector space, but as an explicit symbolic tuple $W = \langle S, T \rangle$: a State Abstraction ($G_{\mathrm{state}}$) instantiating the environmental state $S$, coupled with a Control Policy ($G_{\mathrm{control}}$) representing the transition logic $T : S \times A \to S'$.…
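To make the tuple $W = \langle S, T \rangle$ concrete, here is a minimal Python sketch of an explicit symbolic state and a transition function $T : S \times A \to S'$. It is an illustration only, not the paper's UML-based representation or training pipeline. The names `WorldObject`, `State`, `Action`, and `transition`, and the two toy actions, are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class WorldObject:
    """Hypothetical object record; one entry of the symbolic state S."""
    name: str
    location: str
    held: bool = False


# Assumption: S is a mapping from object names to object records,
# and an action A is a (verb, target object name) pair.
State = Dict[str, WorldObject]
Action = Tuple[str, str]


def transition(state: State, action: Action) -> State:
    """Toy transition logic T: S x A -> S'. Returns a new state and
    leaves the input untouched; returns the input state unchanged when
    the action's precondition does not hold."""
    verb, target = action
    obj = state[target]
    if verb == "pick" and not obj.held:
        updated = replace(obj, held=True, location="gripper")
    elif verb == "place_on_table" and obj.held:
        updated = replace(obj, held=False, location="table")
    else:
        return state  # precondition not met: no state change
    new_state = dict(state)
    new_state[target] = updated
    return new_state


s0: State = {"cup": WorldObject("cup", "shelf")}
s1 = transition(s0, ("pick", "cup"))
s2 = transition(s1, ("place_on_table", "cup"))
print(s2["cup"])
```

Because the state is explicit, a planner can check preconditions and effects against named objects and attributes rather than inferring them from free-form text, which is the kind of structure the abstract argues linear natural language lacks.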
