SigLoMa: Learning Open-World Quadrupedal Loco-Manipulation from Ego-Centric Vision

Shiyi Chen; Haiyi Liu; Mingye Yang; Jiaqi Zhang; Debing Zhang

arXiv:2605.03846 · cs.RO · May 6, 2026

TL;DR

SigLoMa is a fully onboard, ego-centric vision-based quadrupedal loco-manipulation system. It removes the dependence on external motion capture and off-board computation by combining Sigma Points, an ego-centric Kalman Filter, and active sampling, and it performs dynamic tasks in the real world.

Contribution

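The paper presents SigLoMa, a fully onboard system for open-world quadrupedal loco-manipulation from vision. It pairs Sigma Points, a lightweight geometric representation for exteroception, with an ego-centric Kalman Filter that bridges slow perception and fast control.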

Findings

01. Successfully performs dynamic loco-manipulation tasks in real-world settings.

02. Achieves performance comparable to expert human teleoperation.

03. Operates effectively with only a 5 Hz perception update rate.

Abstract

Designing an open-world quadrupedal loco-manipulation system is highly challenging. Traditional reinforcement learning frameworks utilizing exteroception often suffer from extreme sample inefficiency and massive sim-to-real gaps. Furthermore, the inherent latency of visual tracking fundamentally conflicts with the high-frequency demands of precise floating-base control. Consequently, existing systems lean heavily on expensive external motion capture and off-board computation. To eliminate these dependencies, we present SigLoMa, a fully onboard, ego-centric vision-based pick-and-place framework. At the core of SigLoMa is the introduction of Sigma Points, a lightweight geometric representation for exteroception that guarantees high scalability and native sim-to-real alignment. To bridge the frequency divide between slow perception and fast control, we design an ego-centric Kalman Filter…
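
The abstract is cut off before it describes the filter, so the following is only a generic illustration of how a Kalman filter can bridge slow perception and fast control. It is not the authors' implementation. Everything in it is an assumption made for the sketch: the class name `EgoCentricKF`, the 50 Hz control rate, the noise values, a static target, and a motion model that ignores body rotation and perception latency. The filter predicts the target position in the robot body frame at every control step from the body velocity, and corrects it only when a 5 Hz vision measurement arrives.

```python
import numpy as np


class EgoCentricKF:
    """Hypothetical filter: tracks a static target's position in the body frame."""

    def __init__(self, p0, pos_var=0.05**2, process_var=1e-4, meas_var=0.02**2):
        self.x = np.asarray(p0, dtype=float)  # estimated target position [m]
        self.P = np.eye(3) * pos_var          # estimate covariance
        self.Q = np.eye(3) * process_var      # process noise added per predict step
        self.R = np.eye(3) * meas_var         # vision measurement noise

    def predict(self, v_body, dt):
        # Assumption: a static target seen from a translating body moves
        # with -v_body in the body frame (rotation is ignored here).
        self.x = self.x - np.asarray(v_body, dtype=float) * dt
        self.P = self.P + self.Q

    def update(self, z):
        # The measurement is the target position itself, so H = I.
        S = self.P + self.R
        K = self.P @ np.linalg.inv(S)
        self.x = self.x + K @ (np.asarray(z, dtype=float) - self.x)
        self.P = (np.eye(3) - K) @ self.P


def run_demo(control_hz=50, vision_hz=5, duration_s=2.0, seed=0):
    """Simulate fast prediction with slow, noisy vision corrections."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / control_hz
    steps_per_frame = control_hz // vision_hz  # control steps between vision frames
    true_p = np.array([1.0, 0.2, 0.0])         # true target position, body frame
    v_body = np.array([0.3, 0.0, 0.0])         # robot walks toward the target
    kf = EgoCentricKF(p0=true_p + np.array([0.05, -0.05, 0.0]))
    n_updates = 0
    for step in range(1, int(duration_s * control_hz) + 1):
        true_p = true_p - v_body * dt          # ground-truth relative motion
        kf.predict(v_body, dt)                 # runs at the control rate
        if step % steps_per_frame == 0:        # runs at the vision rate
            z = true_p + rng.normal(0.0, 0.02, size=3)
            kf.update(z)
            n_updates += 1
    return kf.x, true_p, n_updates


if __name__ == "__main__":
    estimate, truth, n_updates = run_demo()
    print("vision updates:", n_updates)
    print("position error [m]:", np.linalg.norm(estimate - truth))
```

Between vision frames the controller can read `kf.x` at every step, so the control loop is never blocked waiting for perception. A real system would also need to handle body rotation and the delay between image capture and measurement arrival, which this sketch leaves out.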
