ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement
Yutong Shen, Hangxu Liu, Lei Zhang, Penghui Liu, Yinqi Liu, Liuxiang Yang, Tongtong Feng

TL;DR
ALAS is a biologically inspired dual-stream framework for long-horizon tasks in human-scene interaction, aimed at better generalization and efficiency across domains and skills.
Contribution
This work proposes ALAS, a novel cross-domain learning framework with environment and skill disentanglement inspired by brain pathways, improving long-horizon task performance.
Findings
ALAS achieves a 23% higher subtask success rate on average.
ALAS improves execution efficiency by 29%.
Extensive experiments validate ALAS's cross-domain and cross-skill transfer capabilities.
Abstract
Long-Horizon (LH) tasks in Human-Scene Interaction (HSI) are complex multi-step tasks that require continuous planning, sequential decision-making, and extended execution across domains to achieve the final goal. However, existing methods rely heavily on skill chaining, concatenating pre-trained subtasks while keeping environment observations and self-state tightly coupled. As a result, they cannot generalize to new combinations of environments and skills, and they fail to complete diverse LH tasks across domains. To solve this problem, this paper presents ALAS, a cross-domain learning framework for LH tasks via biologically inspired dual-stream disentanglement. Inspired by the brain's "where-what" dual pathway mechanism, ALAS comprises two core modules: i) an environment learning module for spatial understanding, which captures object functions, spatial relationships, and scene semantics, achieving…
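To make the coupling problem described in the abstract concrete, here is a minimal sketch of what encoding environment observations and self-state in two separate streams could look like. It is an illustration only, not the ALAS architecture: the class `DualStreamEncoder`, the helpers `make_linear` and `apply_linear`, the dimensions, and the random linear-plus-ReLU layers are all assumptions made up for this example.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Build an out_dim x in_dim weight matrix with small random entries."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, vec):
    """Matrix-vector product followed by a ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, vec))) for row in weights]


class DualStreamEncoder:
    """Hypothetical two-stream encoder (not the authors' implementation).

    One stream sees only environment observations, the other only the
    agent's self-state, so neither latent depends on the other input.
    """

    def __init__(self, env_dim, self_dim, latent_dim, seed=0):
        rng = random.Random(seed)
        self.env_weights = make_linear(env_dim, latent_dim, rng)
        self.self_weights = make_linear(self_dim, latent_dim, rng)

    def encode(self, env_obs, self_state):
        env_latent = apply_linear(self.env_weights, env_obs)
        self_latent = apply_linear(self.self_weights, self_state)
        return env_latent, self_latent


encoder = DualStreamEncoder(env_dim=4, self_dim=3, latent_dim=2)
self_state = [0.1, 0.2, 0.3]

# Same self-state, two different (made-up) scenes.
env_a, self_a = encoder.encode([1.0, 0.0, 0.5, 0.2], self_state)
env_b, self_b = encoder.encode([0.0, 1.0, 0.1, 0.9], self_state)

# Changing only the scene leaves the self-state latent untouched.
assert self_a == self_b
```

In a coupled encoder that consumes the concatenation of both inputs, the final assertion would generally not hold, which is one way to picture why recombining environments and skills is harder in that setting.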
