Generative 6D Pose Estimation via Conditional Flow Matching

Amir Hamza; Davide Boscaini; Weihang Li; Benjamin Busam; Fabio Poiesi

arXiv:2602.19719·cs.CV·February 24, 2026

TL;DR

Flose introduces a novel generative approach for 6D pose estimation that leverages conditional flow matching with appearance-based features, outperforming prior methods on multiple datasets.

Contribution

The paper presents Flose, a generative method that applies conditional flow matching with appearance-based semantic features for robust 6D pose estimation, especially for symmetric objects.

Findings

01. Flose achieves +4.5 average recall over prior methods.

02. Incorporates appearance features to resolve symmetries.

03. Validated on five BOP benchmark datasets.

Abstract

Existing methods for instance-level 6D pose estimation typically rely on neural networks that either directly regress the pose in $SE(3)$ or estimate it indirectly via local feature matching. The former struggle with object symmetries, while the latter fail in the absence of distinctive local features. To overcome these limitations, we propose a novel formulation of 6D pose estimation as a conditional flow matching problem in $\mathbb{R}^{3}$. We introduce Flose, a generative method that infers object poses via a denoising process conditioned on local features. While prior approaches based on conditional flow matching perform denoising solely based on geometric guidance, Flose integrates appearance-based semantic features to mitigate ambiguities caused by object symmetries. We further incorporate RANSAC-based registration to handle outliers. We validate Flose on five datasets…
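The two ingredients named in the abstract, flow-matching denoising of points in $\mathbb{R}^{3}$ followed by RANSAC-based registration, can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a linear probability path, replaces the learned, feature-conditioned velocity network with an oracle velocity field that already knows the target points, and uses a Kabsch fit inside a basic RANSAC loop. All function names, the toy object, the pose, and the thresholds are hypothetical.

```python
import numpy as np

def cfm_training_pair(x0, x1, t):
    """Linear path (an assumption): sample x_t and its regression target u_t."""
    xt = (1.0 - t) * x0 + t * x1
    ut = x1 - x0  # constant velocity along a straight path
    return xt, ut

def euler_sample(velocity, x0, steps=50):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with forward Euler."""
    x = x0.copy()
    dt = 1.0 / steps
    for k in range(steps):
        x = x + dt * velocity(x, k * dt)
    return x

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~ src @ R.T + t."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

def ransac_pose(src, dst, iters=200, thresh=0.02, seed=0):
    """Basic RANSAC over 3-point samples, then a refit on the best inlier set."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)
        R, t = kabsch(src[idx], dst[idx])
        inliers = np.linalg.norm(src @ R.T + t - dst, axis=1) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return kabsch(src[best], dst[best]) + (best,)

def demo(seed=0):
    rng = np.random.default_rng(seed)
    model = rng.uniform(-0.1, 0.1, size=(100, 3))  # hypothetical object points
    a = 0.7  # hypothetical ground-truth rotation about z, in radians
    R_gt = np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])
    t_gt = np.array([0.05, -0.02, 0.60])
    target = model @ R_gt.T + t_gt  # object points in the scene frame
    noise = rng.standard_normal(target.shape)
    # Oracle velocity standing in for a trained, feature-conditioned network.
    oracle = lambda x, t: (target - x) / (1.0 - t)
    denoised = euler_sample(oracle, noise)
    # Simulate 20 points where denoising failed (outliers).
    denoised[:20] += rng.uniform(0.1, 0.3, size=(20, 3))
    R, t, inliers = ransac_pose(model, denoised)
    return R, t, R_gt, t_gt, inliers

if __name__ == "__main__":
    R, t, R_gt, t_gt, inliers = demo()
    print("rotation error:", np.abs(R - R_gt).max())
    print("translation error:", np.abs(t - t_gt).max())
    print("inliers kept:", int(inliers.sum()), "of", len(inliers))
```

In this toy setting each model point keeps its index through denoising, so correspondences come for free and the registration step only has to reject the corrupted points. How Flose actually conditions the denoiser and forms correspondences is described in the paper itself, not in this sketch.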

Taxonomy

Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · 3D Shape Modeling and Analysis