TSM-Pose: Topology-Aware Learning with Semantic Mamba for Category-Level Object Pose Estimation

Jinshuo Liu; Bingtao Ma; Junlin Su; Guanyuan Pan; Beining Wu; Cheng Yang; Jiaxuan Lu; Chenggang Yan; Shuai Wang

arXiv:2604.16954 · cs.CV · April 21, 2026


TL;DR

TSM-Pose introduces a topology-aware learning framework with semantic Mamba modules to improve category-level object pose estimation, enhancing generalization to unseen instances.

Contribution

The paper proposes a novel topology extractor and Mamba-based semantic aggregator to better capture structural and semantic features for pose estimation.

Findings

1. TSM-Pose outperforms state-of-the-art methods on three benchmark datasets.

2. The topology extractor improves global structural representation.

3. Semantic Mamba enhances keypoint expressiveness and long-range dependency modeling.

Abstract

Category-level object pose estimation is fundamental for embodied intelligence, yet robust generalization to unseen instances remains challenging. Existing methods rely mainly on simple feature extraction and aggregation, which struggle to capture category-shared topological structures and to model semantic keypoints, limiting their generalization. To address these limitations, we propose TSM-Pose, a Topology-Aware Learning with Semantic Mamba framework for category-level pose estimation. Specifically, we introduce a Topology Extractor that captures a global topological representation of the point cloud; this representation is integrated into local geometry features to enable robust category-level structural representation. Simultaneously, we propose a Mamba-based Global Semantic Aggregator that injects semantic priors into keypoints to enhance…
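
The abstract states that a global representation of the point cloud is integrated into local geometry features, but the excerpt gives no implementation details. The sketch below illustrates only the generic fusion pattern (pool a global descriptor, broadcast it, concatenate it onto each per-point feature). It is not the paper's Topology Extractor: channel-wise max pooling is used here as a stand-in, and the function name `fuse_global_into_local`, the feature shapes, and the random inputs are all assumptions made for illustration.

```python
import numpy as np


def fuse_global_into_local(local_feats: np.ndarray) -> np.ndarray:
    """Concatenate a pooled global descriptor onto every per-point feature.

    local_feats: (N, C) array of per-point features (hypothetical shape).
    Returns an (N, 2C) array: [local feature | global descriptor].
    """
    if local_feats.ndim != 2:
        raise ValueError("expected an (N, C) array of per-point features")

    # Stand-in global descriptor: channel-wise max over all points.
    # ASSUMPTION: the paper's Topology Extractor is something else entirely;
    # max pooling is used only to make the fusion step concrete.
    global_feat = local_feats.max(axis=0)  # shape (C,)

    # Broadcast the global descriptor to every point, then concatenate.
    tiled = np.broadcast_to(global_feat, local_feats.shape)  # shape (N, C)
    return np.concatenate([local_feats, tiled], axis=1)  # shape (N, 2C)


# Illustrative input: 1024 points with 64-dimensional features (made-up sizes).
rng = np.random.default_rng(0)
point_feats = rng.normal(size=(1024, 64))
fused = fuse_global_into_local(point_feats)
print(fused.shape)  # (1024, 128)
```

Concatenation keeps the local and global parts separable for later layers; other fusion choices (addition, gating, attention) would fit the same description in the abstract equally well.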
