Physically Interpretable World Models via Weakly Supervised Representation Learning

Zhenjiang Mao; Mrinall Eashaan Umasudhan; Ivan Ruchkin

arXiv:2412.12870·cs.LG·April 7, 2026

TL;DR

This paper introduces PIWM, a framework for learning physically interpretable world models from images using weak supervision, improving interpretability, prediction accuracy, and parameter recovery in physical systems.

Contribution

PIWM aligns latent representations with physical quantities and dynamics without ground-truth annotations, integrating visual, physical, and dynamics models for interpretability.

Findings

1. PIWM achieves accurate long-term predictions in physical systems.
2. PIWM recovers true physical parameters from visual data.
3. PIWM enhances physical grounding over purely data-driven models.

Abstract

Learning predictive models from high-dimensional sensory observations is fundamental for cyber-physical systems, yet the latent representations learned by standard world models lack physical interpretability. This limits their reliability, generalizability, and applicability to safety-critical tasks. We introduce Physically Interpretable World Models (PIWM), a framework that aligns latent representations with real-world physical quantities and constrains their evolution through partially known physical dynamics. Physical interpretability in PIWM is defined by two complementary properties: (i) the learned latent state corresponds to meaningful physical variables, and (ii) its temporal evolution follows physically consistent dynamics. To achieve this without requiring ground-truth physical annotations, PIWM employs weak distribution-based supervision that captures state uncertainty…
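To make the two ingredients named in the abstract concrete, here is a minimal toy sketch, not the paper's implementation. It assumes a hypothetical 1-D system whose dynamics are known in structure (velocity decays under linear drag) but whose `drag` coefficient is unknown, and it assumes the weak supervision takes the form of a broad Gaussian over position at each step instead of an exact label. The Gaussian form, the drag system, the grid search, and every function name below are illustrative assumptions.

```python
import math
import random


def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) between two 1-D Gaussians q and p."""
    return (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
            - 0.5)


def rollout(x0, v0, drag, dt, steps):
    """Euler rollout of partially known dynamics.

    The structure (linear drag on velocity) is assumed known;
    `drag` is the unknown physical parameter to recover.
    """
    xs, x, v = [], x0, v0
    for _ in range(steps):
        x = x + v * dt
        v = v - drag * v * dt
        xs.append(x)
    return xs


def weak_loss(drag, weak_labels, x0, v0, dt, sigma_model=0.05):
    """Sum of KL terms between predicted and weakly labeled positions.

    Each weak label is a (mean, std) pair describing a distribution
    over the state, not an exact ground-truth value.
    """
    preds = rollout(x0, v0, drag, dt, len(weak_labels))
    return sum(gaussian_kl(p, sigma_model, mu, sigma)
               for p, (mu, sigma) in zip(preds, weak_labels))


def fit_drag(weak_labels, x0, v0, dt, candidates):
    """Pick the candidate drag value with the lowest weak-supervision loss."""
    return min(candidates,
               key=lambda d: weak_loss(d, weak_labels, x0, v0, dt))


# Synthetic data: simulate the true system, then blur it into weak labels.
random.seed(0)
TRUE_DRAG, DT, STEPS = 0.5, 0.1, 50
true_xs = rollout(0.0, 1.0, TRUE_DRAG, DT, STEPS)
weak_labels = [(x + random.gauss(0.0, 0.1), 0.2) for x in true_xs]

candidates = [i / 100 for i in range(0, 201)]  # drag values 0.00 .. 2.00
est_drag = fit_drag(weak_labels, 0.0, 1.0, DT, candidates)
print("estimated drag:", est_drag)
```

Because the dynamics structure is fixed and only `drag` is free, the fitted value is directly readable as a physical quantity, which is the sense of interpretability the abstract describes. A learned image encoder and gradient-based training would replace the hand-set initial state and grid search in any realistic setting.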
