# Category-Level Object Pose Estimation with Statistic Attention

**Authors:** Changhong Jiang, Xiaoqiao Mu, Bingbing Zhang, Chao Liang, Mujun Xie

PMC · DOI: 10.3390/s24165347 · Sensors (Basel, Switzerland) · 2024-08-19

## TL;DR

This paper introduces SAPENet, a category-level 6D object pose estimation method that uses statistical attention to capture higher-order feature statistics and model complex geometric relationships.

## Contribution

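SAPENet replaces the 3D-GC in the encoder with the HS-layer and adds statistical attention, including in the pose regression module, to capture higher-order statistical information that first-order self-attention does not model.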

## Key findings

- SAPENet achieves an mAP of 49.5 on the 5°2 cm metric, 3.4 points higher than the baseline model.
- The method attains state-of-the-art performance on the REAL275 dataset.
- Statistical attention enhances modeling of geometric relationships and detailed object differences.
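For readers unfamiliar with the 5°2 cm metric: a predicted pose is usually counted as correct when its rotation error is below 5° and its translation error is below 2 cm. The following is a minimal sketch of that per-prediction check, assuming poses are given as 3x3 rotation matrices and translation vectors in metres. The function names are hypothetical, and the sketch leaves out the symmetry handling and per-category averaging that a full mAP evaluation would need.

```python
import math

def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_pred^T @ R_gt) equals the sum of elementwise products.
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error_cm(t_pred, t_gt):
    """Euclidean distance between translations (given in metres), in cm."""
    return 100.0 * math.dist(t_pred, t_gt)

def within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg=5.0, max_cm=2.0):
    """True if both the rotation and translation errors are under the thresholds."""
    return (rotation_error_deg(R_pred, R_gt) < max_deg
            and translation_error_cm(t_pred, t_gt) < max_cm)

def rot_z(deg):
    """Helper for the usage example: rotation about the z-axis by `deg` degrees."""
    a = math.radians(deg)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a),  math.cos(a), 0.0],
            [0.0,          0.0,         1.0]]

# Usage: a 3 degree, 1 cm error passes; a 10 degree error does not.
ok = within_threshold(rot_z(3), (0.01, 0.0, 0.5), rot_z(0), (0.0, 0.0, 0.5))
bad = within_threshold(rot_z(10), (0.01, 0.0, 0.5), rot_z(0), (0.0, 0.0, 0.5))
```

Under this reading, the reported score aggregates such pass/fail decisions over the test set; the exact aggregation protocol is described in the complete paper rather than in this summary.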

## Abstract

Six-dimensional (6D) object pose estimation is a fundamental problem in computer vision. Category-level object pose estimation methods based on 3D graph convolution (3D-GC) have recently made significant progress. However, current methods often fail to capture long-range dependencies, which are crucial for modeling complex and occluded object shapes, and they also need to discern detailed differences between objects. Some existing methods use self-attention mechanisms or Transformer encoder–decoder structures to address the lack of long-range dependencies, but they consider only first-order feature information, so they miss more complex information and neglect detailed differences between objects. In this paper, we propose SAPENet, which follows the 3D-GC architecture but replaces the 3D-GC in the encoder with the HS-layer for feature extraction and incorporates statistical attention to compute higher-order statistical information. Three sub-modules are designed for pose regression, point cloud reconstruction, and bounding box voting. The pose regression module also integrates statistical attention, using higher-order statistical information to model geometric relationships and aid regression. Experiments show that our method attains an mAP of 49.5 on the 5°2 cm metric, 3.4 higher than the baseline model, and achieves state-of-the-art (SOTA) performance on the REAL275 dataset.
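The abstract does not spell out how statistical attention is computed, so the following is only an illustrative sketch of the general idea of gating features with statistics beyond the mean. It assumes a simple channel gate built from each channel's mean (first-order) and standard deviation (second-order) over the points. The function names and the weights `w_mean` and `w_std` are hypothetical stand-ins for learned parameters, and this is not the authors' module.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def channel_statistics(features):
    """Per-channel mean and standard deviation over N points.

    `features` is a list of N points, each a list of C channel values.
    """
    n = len(features)
    c = len(features[0])
    means = [sum(p[k] for p in features) / n for k in range(c)]
    stds = [math.sqrt(sum((p[k] - means[k]) ** 2 for p in features) / n)
            for k in range(c)]
    return means, stds

def statistical_channel_attention(features, w_mean=1.0, w_std=1.0):
    """Rescale each channel by a gate computed from its mean and std.

    w_mean and w_std are fixed scalars here (an assumption); a trained
    network would learn this mapping instead.
    """
    means, stds = channel_statistics(features)
    gates = [sigmoid(w_mean * m + w_std * s) for m, s in zip(means, stds)]
    reweighted = [[v * g for v, g in zip(p, gates)] for p in features]
    return reweighted, gates

# Usage: channel 0 varies across points, channel 1 is constant at zero.
points = [[1.0, 0.0], [3.0, 0.0]]
out, gates = statistical_channel_attention(points)
```

In this toy input the varying channel receives a larger gate than the constant one, which illustrates how a statistic such as the standard deviation can influence the weighting in a way that a mean-only gate could not.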

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11359894/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11359894/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC11359894/full.md

---
Source: https://tomesphere.com/paper/PMC11359894