MotionBits: Video Segmentation through Motion-Level Analysis of Rigid Bodies
Howard H. Qian, Kejia Ren, Yu Xiang, Vicente Ordonez, Kaiyu Hang

TL;DR
MotionBits introduces a motion-based segmentation method for rigid bodies that improves understanding of physical interactions in robotic and human environments, surpassing existing models in accuracy and utility.
Contribution
The paper presents the MotionBit concept and definition, a hand-labeled benchmark called MoRiBo, and a learning-free segmentation method that outperforms current state-of-the-art approaches.
Findings
The learning-free method outperforms state-of-the-art methods by 37.3% in macro-averaged mIoU.
Provides a new benchmark for evaluating rigid-body segmentation.
Demonstrates effectiveness in downstream embodied reasoning tasks.
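The macro-averaged mIoU figure above can be illustrated with a small sketch. This summary does not say which groups the paper averages over, so the code assumes each group (for example, one video) gets its own mean IoU and that the groups are then averaged with equal weight. The function names `iou` and `macro_miou` and the toy masks are hypothetical and are not taken from the paper.

```python
import numpy as np

def iou(pred, gt):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks empty: treated as a perfect match (an assumption).
        return 1.0
    return np.logical_and(pred, gt).sum() / union

def macro_miou(groups):
    """Mean IoU per group, then an unweighted mean over groups.

    `groups` maps a group name to a list of (pred_mask, gt_mask) pairs.
    """
    per_group = [np.mean([iou(p, g) for p, g in pairs])
                 for pairs in groups.values()]
    return float(np.mean(per_group))

# Toy data: group "a" has one perfect mask, group "b" has two poor ones.
groups = {
    "a": [([1, 1, 0, 0], [1, 1, 0, 0])],          # IoU 1.0
    "b": [([1, 1, 0, 0], [1, 0, 0, 0]),           # IoU 0.5
          ([0, 0, 0, 0], [1, 1, 1, 1])],          # IoU 0.0
}
score = macro_miou(groups)  # (1.0 + 0.25) / 2 = 0.625
```

Under this assumption every group counts equally regardless of how many masks it contains. Pooling all three masks instead would give 0.5, which is why the averaging scheme matters when comparing reported numbers.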
Abstract
Rigid bodies constitute the smallest manipulable elements in the real world, and understanding how they physically interact is fundamental to embodied reasoning and robotic manipulation. Thus, accurate detection, segmentation, and tracking of moving rigid bodies is essential for enabling reasoning modules to interpret and act in diverse environments. However, current segmentation models trained on semantic grouping are limited in their ability to provide meaningful interaction-level cues for completing embodied tasks. To address this gap, we introduce MotionBit, a novel concept that, unlike prior formulations, defines the smallest unit in motion-based segmentation through kinematic spatial twist equivalence, independent of semantics. In this paper, we contribute (1) the MotionBit concept and definition, (2) a hand-labeled benchmark, called MoRiBo, for evaluating moving rigid-body…
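The abstract excerpt names kinematic spatial twist equivalence but does not give the formal definition, so the following is only one plausible reading, not the authors' method. It assumes each region's motion between two frames is available as a 4x4 rigid transform expressed in a fixed world frame, recovers the constant spatial twist (angular and linear velocity) that produces that transform, and treats two regions as moving together when their twists agree. The names `spatial_twist` and `same_twist`, the tolerance, and the restriction to rotation angles well below π are assumptions made for illustration.

```python
import numpy as np

def spatial_twist(T, dt):
    """Constant spatial twist (wx, wy, wz, vx, vy, vz) that produces the
    world-frame rigid transform T (4x4) over a time step dt."""
    R, p = T[:3, :3], T[:3, 3]
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if theta < 1e-9:
        # No rotation: the twist is a pure translation.
        return np.concatenate([np.zeros(3), p]) / dt
    # Skew-symmetric matrix of the unit rotation axis.
    W = (R - R.T) / (2.0 * np.sin(theta))
    w = np.array([W[2, 1], W[0, 2], W[1, 0]])
    # p = G(theta) v for a unit screw axis (w, v); solve for v.
    G = (theta * np.eye(3)
         + (1.0 - np.cos(theta)) * W
         + (theta - np.sin(theta)) * (W @ W))
    v = np.linalg.solve(G, p)
    return np.concatenate([w * theta, v * theta]) / dt

def same_twist(T_a, T_b, dt, tol=1e-6):
    """True when two transforms correspond to the same spatial twist."""
    return bool(np.allclose(spatial_twist(T_a, dt),
                            spatial_twist(T_b, dt), atol=tol))

# Usage: a 0.3 rad rotation about the world z-axis versus a translation.
a = 0.3
T_rot = np.eye(4)
T_rot[:3, :3] = [[np.cos(a), -np.sin(a), 0.0],
                 [np.sin(a),  np.cos(a), 0.0],
                 [0.0,        0.0,       1.0]]
T_trans = np.eye(4)
T_trans[:3, 3] = [1.0, 2.0, 3.0]

together = same_twist(T_rot, T_rot.copy(), dt=1.0)   # same motion
apart = same_twist(T_rot, T_trans, dt=1.0)           # different motion
```

Because the twist is expressed in a shared world frame, two regions that belong to one rigid body yield the same six numbers no matter where on the body they sit, which is what makes the comparison independent of semantics in this sketch.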
Taxonomy
Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Social Robot Interaction and HRI
