FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution
Xiaoyi Liu, Hao Tang

TL;DR
FOLIAGE is a physics-informed, multimodal world model that predicts unbounded accretive surface growth from partial, multisensory observations. It is accompanied by an evaluation suite for physical intelligence.
Contribution
The paper presents FOLIAGE, a new multimodal world model for surface growth prediction, integrating physics-aware components and a comprehensive evaluation platform for physical intelligence.
Findings
FOLIAGE outperforms specialized baselines in surface growth tasks.
The model demonstrates robustness across dynamic environments.
It effectively handles sensor dropout and modality transfer challenges.
Abstract
Physical intelligence -- anticipating and shaping the world from partial, multisensory observations -- is critical for next-generation world models. We propose FOLIAGE, a physics-informed multimodal world model for unbounded accretive surface growth. In its Action-Perception loop, a unified context encoder maps images, mesh connectivity, and point clouds to a shared latent state. A physics-aware predictor, conditioned on physical control actions, advances this latent state in time to align with the target latent of the surface, yielding a Modality-Agnostic Growth Embedding (MAGE) that interfaces with critic heads for downstream objectives. FOLIAGE's Accretive Graph Network (AGN) captures dynamic connectivity through Age Positional Encoding and Energy-Gated Message-Passing. Geometry-Correspondence Fusion and Cross-Patch Masking enhance MAGE's expressiveness, while Hierarchical Pooling…
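The Action-Perception loop described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes random linear projections in place of the unified context encoder, a single residual layer in place of the physics-aware predictor, and a mean-squared error as the latent alignment objective. All names, dimensions, and weights (LATENT, DIMS, encode, predict, alignment_loss) are hypothetical. Averaging over whichever modalities are present is one simple way to tolerate sensor dropout, and is likewise an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; none of these values come from the paper.
LATENT = 16
ACTION_DIM = 4
DIMS = {"image": 32, "mesh": 24, "points": 20}

# Random linear maps stand in for the learned encoder and predictor.
W_enc = {name: rng.normal(0.0, 0.1, (dim, LATENT)) for name, dim in DIMS.items()}
W_pred = rng.normal(0.0, 0.1, (LATENT + ACTION_DIM, LATENT))


def encode(obs):
    """Project each available modality into the shared latent and average them."""
    latents = [obs[name] @ W_enc[name] for name in DIMS if obs.get(name) is not None]
    if not latents:
        raise ValueError("at least one modality is required")
    return np.mean(latents, axis=0)


def predict(z, action):
    """Advance the latent by one step, conditioned on the control action."""
    return z + np.tanh(np.concatenate([z, action]) @ W_pred)


def alignment_loss(z_pred, z_target):
    """Mean-squared error between predicted and target latents."""
    return float(np.mean((z_pred - z_target) ** 2))


# Synthetic observations at two consecutive time steps.
obs_t = {name: rng.normal(size=dim) for name, dim in DIMS.items()}
obs_next = {name: rng.normal(size=dim) for name, dim in DIMS.items()}
action = rng.normal(size=ACTION_DIM)

z_t = encode(obs_t)                 # shared latent state at time t
z_hat = predict(z_t, action)        # predicted latent at time t + 1
z_target = encode(obs_next)         # target latent from the next observation
loss = alignment_loss(z_hat, z_target)

# Sensor dropout: only the image modality is available.
z_partial = encode({"image": obs_t["image"], "mesh": None, "points": None})
```

In this sketch the predicted latent z_hat plays the role the abstract assigns to the growth embedding: a modality-independent vector that downstream heads could consume. Training would minimize the loss over the encoder and predictor weights, which is omitted here.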
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: ALIGN
