StructBiHOI: Structured Articulation Modeling for Long-Horizon Bimanual Hand-Object Interaction Generation
Zhi Wang, Liu Liu, Ruonan Liu, Dan Guo, Meng Wang

TL;DR
StructBiHOI introduces a hierarchical framework for long-horizon bimanual hand-object interaction generation that disentangles temporal joint planning from frame-level manipulation refinement to improve stability, realism, and efficiency.
Contribution
It presents a structured articulation modeling approach that combines a jointVAE and a maniVAE with a diffusion denoiser for stable, long-horizon bimanual interaction synthesis.
Findings
Achieves superior long-horizon stability and motion realism.
Demonstrates improved computational efficiency over baselines.
Excels in complex dual-hand coordination tasks.
Abstract
Recent progress in 3D hand-object interaction (HOI) generation has primarily focused on single-hand grasp synthesis, while bimanual manipulation remains significantly more challenging. Long-horizon planning instability, fine-grained joint articulation, and complex cross-hand coordination make coherent bimanual generation difficult, especially under multimodal conditions. Existing approaches often struggle to simultaneously ensure temporal consistency, physical plausibility, and semantic alignment over extended sequences. We propose StructBiHOI, a Structured articulation modeling framework for long-horizon Bimanual HOI generation. Our key insight is to structurally disentangle temporal joint planning from frame-level manipulation refinement. Specifically, a jointVAE models long-term joint evolution conditioned on object geometry and task semantics, while a maniVAE refines…
Taxonomy
Topics: Robot Manipulation and Learning · Human Motion and Animation · Social Robot Interaction and HRI
