Pixel2Phys: Distilling Governing Laws from Visual Dynamics
Ruikun Li, Jun Yao, Yingfan Hua, Shixiang Tang, Biqing Qi, Bin Liu, Wanli Ouyang, Yan Lu

TL;DR
Pixel2Phys is a framework that automatically extracts interpretable physical laws from raw visual data by mimicking human scientific reasoning through iterative hypothesis testing and refinement.
Contribution
Pixel2Phys introduces a novel multi-agent system that distills governing equations from high-dimensional videos, bridging the gap between raw pixel data and structured physical knowledge.
Findings
Successfully discovers accurate physical laws from diverse videos.
Maintains stable long-term extrapolation where baselines fail.
Demonstrates interpretability and robustness of the extracted equations.
Abstract
Discovering physical laws directly from high-dimensional visual data is a long-standing human pursuit but remains a formidable challenge for machines, representing a fundamental goal of scientific intelligence. This task is inherently difficult because physical knowledge is low-dimensional and structured, whereas raw video observations are high-dimensional and redundant, with most pixels carrying little or no physical meaning. Extracting concise, physically relevant variables from such noisy data remains a key obstacle. To address this, we propose Pixel2Phys, a collaborative multi-agent framework adaptable to any Multimodal Large Language Model (MLLM). It emulates human scientific reasoning by employing a structured workflow to extract formalized physical knowledge through iterative hypothesis generation, validation, and refinement. By repeatedly formulating, and refining candidate…
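The hypothesis generation, validation, and refinement cycle described in the abstract can be illustrated with a toy sketch. This is not the authors' multi-agent MLLM pipeline: it assumes a one-dimensional state has already been extracted from the video, replaces the MLLM with a fixed candidate library, and fits each candidate by plain least squares. The names `CANDIDATES`, `discover_law`, and the tolerance `tol` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical candidate library: each entry is one hypothesis of the form
# dx/dt = c * term(x). A real system would propose these dynamically.
CANDIDATES = {
    "1": lambda x: 1.0,
    "x": lambda x: x,
    "x^2": lambda x: x * x,
}

def finite_difference(xs, dt):
    """Central-difference estimate of dx/dt at the interior samples."""
    return [(xs[i + 1] - xs[i - 1]) / (2 * dt) for i in range(1, len(xs) - 1)]

def fit_coefficient(features, targets):
    """Closed-form least squares for a single coefficient."""
    num = sum(f * t for f, t in zip(features, targets))
    den = sum(f * f for f in features)
    return num / den

def relative_error(features, targets, coef):
    """Residual norm of the fitted law, relative to the norm of dx/dt."""
    res = sum((t - coef * f) ** 2 for f, t in zip(features, targets))
    tot = sum(t * t for t in targets)
    return math.sqrt(res / tot)

def discover_law(xs, dt, tol=1e-2):
    """Try hypotheses in order; accept the first one that validates."""
    dxdt = finite_difference(xs, dt)
    inner = xs[1:-1]  # samples aligned with the derivative estimates
    history = []
    for name, term in CANDIDATES.items():          # hypothesis generation
        feats = [term(x) for x in inner]
        coef = fit_coefficient(feats, dxdt)
        err = relative_error(feats, dxdt, coef)    # validation
        history.append((name, coef, err))
        if err < tol:
            return name, coef, history
        # otherwise fall through to the next hypothesis (refinement)
    return None, None, history

# Synthetic trajectory of dx/dt = -2x, standing in for a state variable
# that would be extracted from video frames.
dt = 0.01
xs = [math.exp(-2.0 * k * dt) for k in range(200)]
term, coef, history = discover_law(xs, dt)
print(f"dx/dt = {coef:.3f} * {term}")
```

On this synthetic trajectory the constant hypothesis is rejected and the linear one is accepted with a coefficient close to -2. The `history` list keeps every rejected hypothesis and its error, which is the kind of feedback an iterative refinement loop would use when proposing the next candidate.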
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI)
