# Render-Rank-Refine: Accurate 6D Indoor Localization via Circular Rendering

**Authors:** Haya Monawwar, Guoliang Fan

PMC · DOI: 10.3390/jimaging12010010 · Journal of Imaging · 2025-12-25

## TL;DR

This paper introduces Render-Rank-Refine, a two-stage method for six-degree-of-freedom (6-DoF) indoor camera localization that stays accurate in ambiguous room layouts, where several rooms appear in a single query image.

## Contribution

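The paper contributes Render-Rank-Refine, a two-stage framework that operates on coarse semantic meshes and improves on the SPVLoc baseline in both accuracy and inference throughput, without requiring detailed floorplans, textured models, or per-scene fine-tuning.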

## Key findings

- Render-Rank-Refine reduces translation error by 40.4% and rotation error by 29.7% compared to the SPVLoc baseline in ambiguous layouts.
- The method achieves 25.8–26.4 queries per second (QPS), significantly faster than other recent comparable methods, while keeping accuracy comparable to or better than the SPVLoc baseline.
- The framework operates on coarse semantic meshes and avoids strict geometric assumptions, enabling robust indoor localization.
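
To put the reported figures in more familiar terms, the following minimal Python sketch converts throughput into per-query latency and shows how a relative error reduction is applied. It assumes queries are processed one at a time (no batching), which the summary does not state, and the function names are illustrative only. The baseline error passed to `apply_relative_reduction` is a placeholder, not a number from the paper.

```python
def qps_to_latency_ms(qps: float) -> float:
    """Average time per query in milliseconds, assuming sequential processing."""
    return 1000.0 / qps


def apply_relative_reduction(baseline_error: float, reduction_pct: float) -> float:
    """Error remaining after a relative reduction given in percent."""
    return baseline_error * (1.0 - reduction_pct / 100.0)


if __name__ == "__main__":
    # Throughput range reported in the key findings.
    for qps in (25.8, 26.4):
        print(f"{qps} QPS is about {qps_to_latency_ms(qps):.1f} ms per query")

    # Placeholder baseline of 100 units, used only to illustrate the percentages.
    print(apply_relative_reduction(100.0, 40.4))  # translation error remaining
    print(apply_relative_reduction(100.0, 29.7))  # rotation error remaining
```

Under that sequential assumption, the reported range corresponds to roughly 38 to 39 ms per query, which is consistent with the abstract's description of the method as near-real-time.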

## Abstract

Accurate six-degree-of-freedom (6-DoF) camera pose estimation is essential for augmented reality, robotics navigation, and indoor mapping. Existing pipelines often depend on detailed floorplans, strict Manhattan-world priors, and dense structural annotations, which lead to failures in ambiguous room layouts where multiple rooms appear in a query image and their boundaries may overlap or be partially occluded. We present Render-Rank-Refine, a two-stage framework operating on coarse semantic meshes without requiring textured models or per-scene fine-tuning. First, panoramas rendered from the mesh enable global retrieval of coarse pose hypotheses. Then, perspective views from the top-k candidates are compared to the query via rotation-invariant circular descriptors, and this comparison re-ranks the matches before final translation and rotation refinement. Our method increases camera localization accuracy compared to the state-of-the-art SPVLoc baseline by reducing the translation error by 40.4% and the rotation error by 29.7% in ambiguous layouts, as evaluated on the Zillow Indoor Dataset. In terms of inference throughput, our method achieves 25.8–26.4 queries per second (QPS), which is significantly faster than other recent comparable methods, while maintaining accuracy comparable to or better than the SPVLoc baseline. These results demonstrate robust, near-real-time indoor localization that overcomes structural ambiguities and heavy geometric assumptions.
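
The re-ranking step described in the abstract can be illustrated with a small Python sketch. This summary does not specify how the paper's circular descriptors are built, so the code below swaps in a generic construction as an assumption: it samples a grayscale image on concentric rings and keeps the FFT magnitude along each ring, which is unchanged when the ring is circularly shifted (that is, when the image is rotated in-plane). The names `circular_descriptor` and `rerank`, the ring and angle counts, and the Euclidean distance are all hypothetical choices, not the authors' implementation.

```python
import numpy as np


def circular_descriptor(image, n_rings=8, n_angles=64):
    """Generic rotation-invariant descriptor (an assumption, not the paper's).

    Samples the image on concentric rings around its center and keeps the
    FFT magnitude along each ring; circular shifts along a ring only change
    the phase, so the magnitude is invariant to them.
    """
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = min(cy, cx)
    radii = np.linspace(max_r / n_rings, max_r, n_rings)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    # Nearest-neighbour sampling on a (rings x angles) polar grid.
    ys = np.rint(cy + radii[:, None] * np.sin(angles)[None, :]).astype(int)
    xs = np.rint(cx + radii[:, None] * np.cos(angles)[None, :]).astype(int)
    polar = image[ys, xs].astype(float)
    desc = np.abs(np.fft.rfft(polar, axis=1)).ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc


def rerank(query, candidates):
    """Return candidate indices sorted by descriptor distance to the query."""
    q = circular_descriptor(query)
    dists = [np.linalg.norm(q - circular_descriptor(c)) for c in candidates]
    return [int(i) for i in np.argsort(dists)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.random((33, 33))
    unrelated = rng.random((33, 33))
    # A rotated copy of the query stands in for a rendered view at another heading.
    candidates = [unrelated, np.rot90(query)]
    print(rerank(query, candidates))
```

In a pipeline shaped like the one in the abstract, `candidates` would be the perspective views rendered from the top-k retrieved pose hypotheses, and the re-ranked order would decide which hypothesis is passed on to translation and rotation refinement.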

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12842481/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12842481/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/PMC12842481/full.md

---
Source: https://tomesphere.com/paper/PMC12842481