Learn2Fold: Structured Origami Generation with World Model Planning

Yanjia Huang; Yunuo Chen; Ying Jiang; Jinru Han; Zhengzhong Tu; Yin Yang; Chenfanfu Jiang

arXiv:2603.29585·cs.GR·April 6, 2026

TL;DR

Learn2Fold is a neuro-symbolic framework that generates valid origami folding sequences from text by combining language models with a differentiable physical simulator in a planning loop.

Contribution

Learn2Fold decouples semantic generation from physical verification when producing origami folding sequences from natural language descriptions.
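
The decoupling described above can be illustrated with a toy propose-then-verify loop. This is only a sketch under stated assumptions, not the paper's implementation: `stub_proposer` is a hypothetical stand-in for the language model, and `satisfies_maekawa` is a hypothetical stand-in for the physical verifier. It checks Maekawa's theorem (at a flat-foldable interior vertex, the mountain and valley crease counts differ by exactly 2), which is a necessary but not sufficient condition and is not taken from the paper.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A fold step is (crease_id, assignment); assignment is "M" (mountain) or "V" (valley).
# This representation is an assumption made for illustration only.
FoldStep = Tuple[int, str]
FoldPlan = List[FoldStep]


def satisfies_maekawa(plan: FoldPlan) -> bool:
    """Toy verifier: treat all creases as meeting at one interior vertex
    and check Maekawa's theorem, |#mountain - #valley| == 2."""
    mountains = sum(1 for _, assignment in plan if assignment == "M")
    valleys = sum(1 for _, assignment in plan if assignment == "V")
    return abs(mountains - valleys) == 2


def stub_proposer(prompt: str) -> List[FoldPlan]:
    """Hypothetical stand-in for a language model: returns fixed candidates
    and ignores the prompt."""
    return [
        [(0, "M"), (1, "V"), (2, "M"), (3, "V")],  # |M - V| = 0, rejected
        [(0, "M"), (1, "M"), (2, "M"), (3, "V")],  # |M - V| = 2, accepted
    ]


def plan_with_verification(
    candidates: Iterable[FoldPlan],
    verify: Callable[[FoldPlan], bool],
) -> Optional[FoldPlan]:
    """Return the first candidate the verifier accepts, or None if all fail."""
    for plan in candidates:
        if verify(plan):
            return plan
    return None


if __name__ == "__main__":
    print(plan_with_verification(stub_proposer("a paper crane"), satisfies_maekawa))
```

The proposer and the verifier only meet through the `verify` callback, so either side can be swapped out independently; that separation is the point of the sketch.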

Findings

01  Successfully generates complex origami folds from text prompts.

02  Ensures physical validity through a learned graph-structured world model.

03  Outperforms existing methods in producing feasible folding sequences.

Abstract

The ability to transform a flat sheet into a complex three-dimensional structure is a fundamental test of physical intelligence. Unlike cloth manipulation, origami is governed by strict geometric axioms and hard kinematic constraints, where a single invalid crease or collision can invalidate the entire folding sequence. As a result, origami demands long-horizon constructive reasoning that jointly satisfies precise physical laws and high-level semantic intent. Existing approaches fall into two disjoint paradigms: optimization-based methods enforce physical validity but require dense, precisely specified inputs, making them unsuitable for sparse natural language descriptions, while generative foundation models excel at semantic and perceptual synthesis yet fail to produce long-horizon, physics-consistent folding processes. Consequently, generating valid origami folding sequences directly…
