Physically Interpretable World Models via Weakly Supervised Representation Learning
Zhenjiang Mao, Mrinall Eashaan Umasudhan, Ivan Ruchkin

TL;DR
This paper introduces PIWM, a framework for learning physically interpretable world models from images using weak supervision, improving interpretability, prediction accuracy, and parameter recovery in physical systems.
Contribution
PIWM aligns latent representations with physical quantities and dynamics without ground-truth annotations, integrating visual, physical, and dynamics models for interpretability.
Findings
PIWM achieves accurate long-term predictions in physical systems.
PIWM recovers true physical parameters from visual data.
PIWM improves physical grounding relative to purely data-driven world models.
Abstract
Learning predictive models from high-dimensional sensory observations is fundamental for cyber-physical systems, yet the latent representations learned by standard world models lack physical interpretability. This limits their reliability, generalizability, and applicability to safety-critical tasks. We introduce Physically Interpretable World Models (PIWM), a framework that aligns latent representations with real-world physical quantities and constrains their evolution through partially known physical dynamics. Physical interpretability in PIWM is defined by two complementary properties: (i) the learned latent state corresponds to meaningful physical variables, and (ii) its temporal evolution follows physically consistent dynamics. To achieve this without requiring ground-truth physical annotations, PIWM employs weak distribution-based supervision that captures state uncertainty…
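The two properties named in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The pendulum system, the interval-shaped weak label, the moment-matched Gaussian target, the KL penalty, and all function names are assumptions chosen for illustration. The idea is that a latent state is (i) pulled toward a coarse distribution over a physical variable instead of an exact label, and (ii) rolled forward by dynamics whose form is known but whose parameter (here, the pendulum length) would be learned.

```python
import math


def interval_to_gaussian(lo, hi):
    """Turn a weak interval label [lo, hi] into a Gaussian (mean, std).

    Assumption: match the mean and std of a uniform distribution on [lo, hi].
    """
    mean = 0.5 * (lo + hi)
    std = (hi - lo) / math.sqrt(12.0)
    return mean, std


def gaussian_kl(m_q, s_q, m_p, s_p):
    """KL( N(m_q, s_q^2) || N(m_p, s_p^2) ) for univariate Gaussians."""
    return math.log(s_p / s_q) + (s_q ** 2 + (m_q - m_p) ** 2) / (2.0 * s_p ** 2) - 0.5


def weak_supervision_loss(latent_mean, latent_std, lo, hi):
    """Penalty pulling an encoder's latent posterior toward the weak label.

    Hypothetical loss: the source only says the supervision is
    distribution-based; the KL form is an assumption.
    """
    m_p, s_p = interval_to_gaussian(lo, hi)
    return gaussian_kl(latent_mean, latent_std, m_p, s_p)


def pendulum_step(theta, omega, length, dt=0.01, g=9.81):
    """One explicit-Euler step of a frictionless pendulum.

    The equation's form is 'known'; `length` stands in for the unknown
    physical parameter that training would have to recover.
    """
    theta_next = theta + dt * omega
    omega_next = omega - dt * (g / length) * math.sin(theta)
    return theta_next, omega_next


def rollout(theta0, omega0, length, steps, dt=0.01):
    """Roll the interpretable latent state (theta, omega) forward in time."""
    states = [(theta0, omega0)]
    for _ in range(steps):
        states.append(pendulum_step(*states[-1], length, dt=dt))
    return states
```

In a full model, the latent mean and standard deviation would come from an image encoder and `length` would be a trainable parameter updated by gradient descent. Both are omitted here to keep the sketch dependency-free.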
