TL;DR
LLMPhy is a framework that combines large language models with physics engines to identify physical parameters and construct digital twins of input scenes for physical reasoning.
Contribution
It introduces a black-box optimization method that leverages LLMs for parameter estimation in physical scenes, integrating textual knowledge with physics simulation.
Findings
Outperforms prior methods in parameter recovery accuracy.
Achieves state-of-the-art results on new physical reasoning benchmarks.
Demonstrates reliable convergence in zero-shot physical reasoning tasks.
Abstract
Most learning-based approaches to complex physical reasoning sidestep the crucial problem of identifying the parameters (e.g., mass, friction) that govern scene dynamics, despite its importance in real-world applications such as collision avoidance and robotic manipulation. In this paper, we present LLMPhy, a black-box optimization framework that integrates large language models (LLMs) with physics simulators for physical reasoning. The core insight of LLMPhy is to bridge the textbook physical knowledge embedded in LLMs with the world models implemented in modern physics engines, enabling the construction of digital twins of input scenes via latent parameter estimation. Specifically, LLMPhy decomposes digital twin construction into two subproblems: (i) a continuous problem of estimating physical parameters and (ii) a discrete problem of estimating scene layout. For each subproblem,…
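The continuous subproblem, estimating a physical parameter by proposing a value, simulating it, and scoring the result against the observed scene, can be illustrated with a small black-box loop. This is a minimal sketch under stated assumptions, not the authors' implementation: `simulate` is a toy one-dimensional stand-in for a physics engine, `propose` is a hypothetical placeholder where an LLM call would go (here it is plain random perturbation of the best guess), and every function name and constant is invented for illustration.

```python
import random


def simulate(friction, v0=5.0, dt=0.01, steps=100, g=9.81):
    """Toy stand-in for a physics engine (assumption): a block sliding on a
    flat surface, slowed by Coulomb friction. Returns position per time step."""
    x, v, trajectory = 0.0, v0, []
    for _ in range(steps):
        v = max(0.0, v - friction * g * dt)  # friction cannot reverse the motion
        x += v * dt
        trajectory.append(x)
    return trajectory


def trajectory_error(a, b):
    """Mean squared error between two equal-length trajectories."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def propose(history, rng):
    """Hypothetical proposer. In an LLM-driven loop this would be a model call
    that reads the (parameter, error) history; here it is a random
    perturbation of the best guess so far, clipped to [0, 1]."""
    if not history:
        return rng.uniform(0.0, 1.0)
    best_mu, _ = min(history, key=lambda h: h[1])
    return min(1.0, max(0.0, best_mu + rng.gauss(0.0, 0.05)))


def estimate_friction(observed, iterations=200, seed=0):
    """Black-box loop: propose a parameter, simulate, score, record, repeat."""
    rng = random.Random(seed)
    history = []
    for _ in range(iterations):
        mu = propose(history, rng)
        err = trajectory_error(simulate(mu), observed)
        history.append((mu, err))
    return min(history, key=lambda h: h[1])


# "Observed" data generated with a known friction coefficient of 0.3.
observed = simulate(0.3)
best_mu, best_err = estimate_friction(observed)
print(f"estimated friction = {best_mu:.3f}, error = {best_err:.2e}")
```

The optimizer only sees the simulator through its inputs and outputs, which is what makes the loop black-box: the proposer can be swapped for any strategy that maps a history of scored guesses to a new guess.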
