CAPT: Category-level Articulation Estimation from a Single Point Cloud Using Transformer

Lian Fu; Ryoichi Ishikawa; Yoshihiro Sato; Takeshi Oishi

arXiv:2402.17360·cs.CV·February 28, 2024·1 cites

TL;DR

This paper introduces CAPT, a transformer-based method for accurately estimating joint parameters and states of articulated objects from a single point cloud, enhancing robustness and precision in robotics and vision applications.

Contribution

The paper presents a novel end-to-end transformer architecture with a motion loss and double voting strategy for improved articulation estimation from point clouds.

Findings

01 Outperforms existing methods on multiple datasets

02 Achieves high precision and robustness in joint parameter estimation

03 Demonstrates effectiveness of transformer architecture in articulated object analysis

Abstract

The ability to estimate joint parameters is essential for various applications in robotics and computer vision. In this paper, we propose CAPT: category-level articulation estimation from a point cloud using Transformer. CAPT uses an end-to-end transformer-based architecture for joint parameter and state estimation of articulated objects from a single point cloud. The proposed CAPT methods accurately estimate joint parameters and states for various articulated objects with high precision and robustness. The paper also introduces a motion loss approach, which improves articulation estimation performance by emphasizing the dynamic features of articulated objects. Additionally, the paper presents a double voting strategy to provide the framework with coarse-to-fine parameter estimation. Experimental results on several category datasets demonstrate that our methods outperform existing…
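To make "joint parameters" and "joint state" concrete, here is a minimal sketch for a revolute (hinge) joint. It assumes the joint is described by an axis direction and a pivot point (the parameters) plus an opening angle (the state). The function name `rotate_about_joint` and this parameterization are illustrative assumptions, not CAPT's code or its network outputs; the sketch only shows how such parameters move the points of a part, using Rodrigues' rotation formula.

```python
import math


def rotate_about_joint(points, axis, pivot, angle):
    """Rotate 3D points by `angle` radians about the line through `pivot` along `axis`.

    Hypothetical helper for illustration: `axis` and `pivot` play the role of
    revolute joint parameters, `angle` the role of the joint state.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("axis must be non-zero")
    k = [a / norm for a in axis]  # unit axis direction
    c, s = math.cos(angle), math.sin(angle)

    rotated = []
    for p in points:
        # Express the point relative to the pivot on the joint axis.
        v = [p[i] - pivot[i] for i in range(3)]
        # Cross product k x v and dot product k . v.
        kxv = [
            k[1] * v[2] - k[2] * v[1],
            k[2] * v[0] - k[0] * v[2],
            k[0] * v[1] - k[1] * v[0],
        ]
        kdv = sum(k[i] * v[i] for i in range(3))
        # Rodrigues' formula: v*cos + (k x v)*sin + k*(k . v)*(1 - cos).
        r = [v[i] * c + kxv[i] * s + k[i] * kdv * (1.0 - c) for i in range(3)]
        rotated.append(tuple(r[i] + pivot[i] for i in range(3)))
    return rotated


# Example: a door edge point swung 90 degrees about a vertical hinge at x = 1.
moved = rotate_about_joint(
    points=[(2.0, 0.0, 0.0)],
    axis=(0.0, 0.0, 1.0),
    pivot=(1.0, 0.0, 0.0),
    angle=math.pi / 2,
)
```

Estimating articulation from a single point cloud is the inverse problem: given only the observed points, recover the axis, pivot, and angle. A prismatic (sliding) joint would instead translate points along the axis by a scalar displacement.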

Taxonomy

Topics: Hand Gesture Recognition Systems · Structural Health Monitoring Techniques · Optical measurement and interference techniques

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Dropout · Dense Connections · Label Smoothing · Adam · Softmax · Layer Normalization · Multi-Head Attention