Model-Free Transformer Framework for 6-DoF Pose Estimation of Textureless Tableware Objects

Jungwoo Lee; Hyogon Kim; Ji-Wook Kwon; Sung-Jo Yun; Na-Hyun Lee; Young-Ho Choi; Goobong Chung; Jinho Suh

PMC · DOI: 10.3390/s25196167 · October 5, 2025

Open Access

TL;DR

This paper introduces a new method for estimating the 3D position and orientation of textureless tableware using a transformer model and depth data, enabling robots to grasp objects more effectively.

Contribution

A model-free and texture-free 6-DoF pose estimation framework using transformer architecture and geometry-based features from depth images.

Findings

01 The method achieves an average rotational error of 3.53 degrees and translational error of 13.56 mm.

02 Real-world experiments show successful autonomous recognition and collection of tableware by a mobile robot.

03 Geometry-based features like surface vertices and rim normals provide strong structural priors for pose estimation.
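
This summary does not state how the rotational and translational errors are defined. As a minimal sketch, assuming the common convention of a geodesic angle between rotation matrices and a Euclidean distance between translation vectors, the two metrics could be computed as follows. The function names and the `rot_z` helper are hypothetical and only serve the illustration.

```python
import math

def rotation_error_deg(R_pred, R_true):
    """Geodesic angle in degrees between two 3x3 rotation matrices (nested lists)."""
    # trace(R_pred^T R_true) equals the sum of element-wise products.
    trace = sum(R_pred[i][j] * R_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error_mm(t_pred, t_true):
    """Euclidean distance between two translations, both given in millimetres."""
    return math.dist(t_pred, t_true)

def rot_z(deg):
    """Hypothetical helper: rotation about the z-axis by `deg` degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
```

For example, `rotation_error_deg(rot_z(0), rot_z(10))` measures the 10-degree gap between two rotations about the same axis. Averaging these per-sample values over a test set would give figures of the kind reported above, under the stated assumption about the metric definitions.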

Abstract

Tableware objects such as plates, bowls, and cups are usually textureless, uniform in color, and vary widely in shape, making it difficult to apply conventional pose estimation methods that rely on texture cues or object-specific CAD models. These limitations present a significant obstacle to robotic manipulation in restaurant environments, where reliable six-degree-of-freedom (6-DoF) pose estimation is essential for autonomous grasping and collection. To address this problem, we propose a model-free and texture-free 6-DoF pose estimation framework based on a transformer encoder architecture. This method uses only geometry-based features extracted from depth images, including surface vertices and rim normals, which provide strong structural priors. The pipeline begins with object detection and segmentation using a pretrained video foundation model, followed by the generation of…
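
The geometry-based features named in the abstract start from depth pixels lifted into 3D. The sketch below is not the authors' feature extractor. It only illustrates, assuming a standard pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, how a depth image can be back-projected into surface vertices and how a normal can be estimated from neighbouring vertices. Rim detection is not covered.

```python
import math

def backproject(depth, fx, fy, cx, cy):
    """Convert a depth image (rows of metres) into a grid of camera-frame 3D vertices."""
    return [
        [((u - cx) * z / fx, (v - cy) * z / fy, z) for u, z in enumerate(row)]
        for v, row in enumerate(depth)
    ]

def normal_at(vertices, v, u):
    """Estimate a unit normal at pixel (v, u) from forward differences.

    Assumes (v + 1, u) and (v, u + 1) exist and hold valid depth; a real
    pipeline would also need to handle borders and missing measurements.
    """
    p = vertices[v][u]
    du = [a - b for a, b in zip(vertices[v][u + 1], p)]  # step along image x
    dv = [a - b for a, b in zip(vertices[v + 1][u], p)]  # step along image y
    # Cross product du x dv; its sign depends on the chosen axis convention.
    n = [
        du[1] * dv[2] - du[2] * dv[1],
        du[2] * dv[0] - du[0] * dv[2],
        du[0] * dv[1] - du[1] * dv[0],
    ]
    norm = math.sqrt(sum(c * c for c in n))
    return [c / norm for c in n]
```

On a flat, fronto-parallel depth patch every estimated normal is parallel to the optical axis, which makes a quick sanity check before applying the same idea to real sensor data.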

Taxonomy

Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · Soft Robotics and Applications