Habilis-$\beta$: A Fast-Motion and Long-Lasting On-Device Vision-Language-Action Model
Tommoro Robotics: Jesoon Kang, Taegeon Park, Jisu An, Soo Min Kimm, Jaejoon Kim, Jinu Pahk, Byungju Kim, Junseok Lee, Namheon Baek, Sungwan Ha, Hojun Baek, Eduardo Ayerve Cruz, Wontae Kim, Junghyeon Choi, Yousuk Lee, Joonmo Han, Sunghyun Cho, Sunghyun Kwon, Soyoung Lee

TL;DR
Habilis-$\beta$ is an on-device vision-language-action model built for continuous real-world operation, evaluated on both execution speed (Tasks per Hour) and sustained robustness (Mean Time Between Intervention) rather than on single-trial success rate alone.
Contribution
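The paper introduces Habilis-$\beta$, a VLA model, together with the Productivity-Reliability Plane (PRP), a continuous-run evaluation protocol, and a training recipe that combines language-free pre-training on large-scale play data with post-training on cyclic task demonstrations.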
Findings
Achieves 572.6 TPH and 39.2 s MTBI in simulation
Achieves 124 TPH and 137.4 s MTBI in the real world
Sets new state-of-the-art on RoboTwin 2.0 leaderboard
Abstract
We introduce Habilis-$\beta$, a fast-motion and long-lasting on-device vision-language-action (VLA) model designed for real-world deployment. Current VLA evaluation remains largely confined to single-trial success rates under curated resets, which fails to capture the fast-motion and long-lasting capabilities essential for practical operation. To address this, we introduce the Productivity-Reliability Plane (PRP), which evaluates performance through Tasks per Hour (TPH) and Mean Time Between Intervention (MTBI) under a continuous-run protocol that demands both high-speed execution and sustained robustness. Habilis-$\beta$ achieves high performance by integrating language-free pre-training on large-scale play data for robust interaction priors with post-training on cyclic task demonstrations that capture state drift across consecutive task iterations. The system further employs ESPADA…
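To make the two PRP axes concrete, the sketch below computes TPH and MTBI from a summary of one continuous run. The abstract does not give the exact formulas, so the definitions here are assumptions: TPH is taken as completed tasks divided by run time in hours, and MTBI as run time divided by the number of human interventions. The `RunLog` structure and both function names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunLog:
    """Summary of one continuous run (hypothetical structure, not from the paper)."""
    duration_s: float      # total wall-clock length of the run, in seconds
    tasks_completed: int   # task iterations finished during the run
    interventions: int     # number of times a human had to step in


def tasks_per_hour(log: RunLog) -> float:
    """Productivity axis: assumed to be completed tasks per hour of run time."""
    if log.duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return log.tasks_completed / (log.duration_s / 3600.0)


def mean_time_between_interventions(log: RunLog) -> float:
    """Reliability axis: assumed to be run time (s) per intervention."""
    if log.duration_s <= 0:
        raise ValueError("duration_s must be positive")
    if log.interventions == 0:
        # Assumption: a run with no interventions is reported as unbounded.
        return float("inf")
    return log.duration_s / log.interventions


# Example: a one-hour run with 120 completed tasks and 30 interventions.
log = RunLog(duration_s=3600.0, tasks_completed=120, interventions=30)
tph = tasks_per_hour(log)
mtbi = mean_time_between_interventions(log)
print(f"TPH = {tph:.1f}, MTBI = {mtbi:.1f} s")
```

Under these assumed definitions a model is placed on the plane by the pair (TPH, MTBI), so a policy that moves fast but needs frequent help and one that is slow but rarely needs help land in visibly different regions.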
Taxonomy
Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Social Robot Interaction and HRI
