# HiRo-SLAM: A High-Accuracy and Robust Visual-Inertial SLAM System with Precise Camera Projection Modeling and Adaptive Feature Selection

**Authors:** Yujuan Deng, Liang Tian, Xiaohui Hou, Xin Liu, Yonggang Wang, Xingchao Liu, Chunyuan Liao

PMC · DOI: 10.3390/s26020711 · Sensors (Basel, Switzerland) · 2026-01-21

## TL;DR

HiRo-SLAM is a new visual-inertial SLAM system that improves accuracy and robustness by combining precise camera modeling, adaptive feature selection, and robust optimization.

## Contribution

The paper introduces a unified optimization framework integrating precise camera projection modeling and adaptive feature selection for visual-inertial SLAM.

## Key findings

- HiRo-SLAM achieves a 30.0% reduction in absolute trajectory error on the EuRoC MAV dataset compared to strong baselines.
- The system attains millimeter-level accuracy on specific sequences under controlled conditions.
- HiRo-SLAM outperforms state-of-the-art methods on multiple benchmarks including EuRoC MAV, TUM-VI, and OIVIO.
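
For readers unfamiliar with the metric, absolute trajectory error (ATE) is conventionally reported as the root-mean-square position error after aligning the estimated trajectory to ground truth. The notation below is an assumption for illustration: this summary does not state which alignment or symbols the paper uses.

$$
\mathrm{ATE}_{\mathrm{RMSE}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left\lVert \mathbf{p}_i^{\mathrm{gt}} - \left(\mathbf{R}\,\mathbf{p}_i^{\mathrm{est}} + \mathbf{t}\right)\right\rVert^{2}},
\qquad
\text{reduction} = \frac{\mathrm{ATE}_{\mathrm{base}} - \mathrm{ATE}_{\mathrm{ours}}}{\mathrm{ATE}_{\mathrm{base}}} \times 100\%
$$

Here $(\mathbf{R}, \mathbf{t})$ is the rigid alignment between the two trajectories and $N$ is the number of matched poses. Under this definition, a 30.0% reduction means the reported error is 0.7 times the baseline error.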

## Abstract

What are the main findings?

We introduce a unified optimization framework that integrates a precise camera projection model (incorporating analytical distortion Jacobians) with Graduated Non-Convexity (GNC) robust estimation. This approach significantly improves system accuracy and stability by simultaneously minimizing error sources and optimizing the backend.
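
To make "analytical distortion Jacobians" concrete, the following is a minimal sketch that assumes a pinhole camera with two radial distortion coefficients. The distortion model, the function names, and the intrinsic values are all assumptions for illustration; the summary does not specify the model the paper uses. The closed-form Jacobian is cross-checked against central finite differences.

```python
def project(x, y, fx, fy, cx, cy, k1, k2):
    """Map a normalized image point (x, y) to pixel coordinates (u, v).

    Assumed model: pinhole projection with radial distortion (k1, k2).
    """
    r2 = x * x + y * y
    d = 1.0 + k1 * r2 + k2 * r2 * r2  # radial distortion factor
    return fx * x * d + cx, fy * y * d + cy


def projection_jacobian(x, y, fx, fy, k1, k2):
    """Closed-form 2x2 Jacobian d(u, v) / d(x, y) of `project`."""
    r2 = x * x + y * y
    d = 1.0 + k1 * r2 + k2 * r2 * r2
    g = k1 + 2.0 * k2 * r2  # chosen so that dd/dx = 2*x*g and dd/dy = 2*y*g
    return [
        [fx * (d + 2.0 * x * x * g), fx * 2.0 * x * y * g],
        [fy * 2.0 * x * y * g, fy * (d + 2.0 * y * y * g)],
    ]


def numeric_jacobian(x, y, fx, fy, cx, cy, k1, k2, h=1e-6):
    """Central finite differences, used only to cross-check the closed form."""
    u_xp, v_xp = project(x + h, y, fx, fy, cx, cy, k1, k2)
    u_xm, v_xm = project(x - h, y, fx, fy, cx, cy, k1, k2)
    u_yp, v_yp = project(x, y + h, fx, fy, cx, cy, k1, k2)
    u_ym, v_ym = project(x, y - h, fx, fy, cx, cy, k1, k2)
    return [
        [(u_xp - u_xm) / (2 * h), (u_yp - u_ym) / (2 * h)],
        [(v_xp - v_xm) / (2 * h), (v_yp - v_ym) / (2 * h)],
    ]


# Hypothetical intrinsics and distortion coefficients, not taken from the paper.
FX, FY, CX, CY, K1, K2 = 460.0, 457.0, 367.0, 248.0, -0.28, 0.07

analytic = projection_jacobian(0.3, -0.2, FX, FY, K1, K2)
numeric = numeric_jacobian(0.3, -0.2, FX, FY, CX, CY, K1, K2)
max_gap = max(abs(analytic[i][j] - numeric[i][j]) for i in range(2) for j in range(2))
print("largest analytic/numeric difference:", max_gap)
```

In a nonlinear least-squares backend, a closed-form Jacobian of this kind would be chained with the derivatives of the pose and landmark parameters, so the distortion terms enter the optimization exactly rather than through a numerical approximation.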

The Visibility Pyramid-based Adaptive Non-Maximum Suppression (P-ANMS) mechanism, combined with a hybrid point-line frontend fusing XFeat and SOLD2, addresses core challenges in feature tracking. This integration is particularly effective in environments characterized by weak textures or repetitive structures.
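
The idea of scoring a feature by its contribution to multi-scale image coverage can be illustrated with a small sketch. This is not the paper's P-ANMS algorithm, whose details are not given in this summary. It is a generic greedy selection over a multi-resolution occupancy grid, and the cell weighting (`4 ** level`), the function names, and the sample points are all assumptions.

```python
def pyramid_cells(x, y, width, height, levels):
    """Return the (level, col, row) grid cell containing (x, y) at each level.

    Level l splits the image into 2**l by 2**l cells.
    """
    cells = []
    for level in range(1, levels + 1):
        n = 2 ** level
        col = min(int(x * n / width), n - 1)
        row = min(int(y * n / height), n - 1)
        cells.append((level, col, row))
    return cells


def select_spread_features(points, width, height, budget, levels=3):
    """Greedily pick up to `budget` point indices that add the most coverage.

    `points` is a list of (x, y, response). A newly covered cell at level l
    is worth 4**l (an assumed weighting: the number of cells at that level).
    Ties in coverage gain are broken by detector response.
    """
    occupied = set()
    remaining = list(range(len(points)))
    chosen = []
    while remaining and len(chosen) < budget:
        def score(i):
            x, y, response = points[i]
            gain = sum(
                4 ** cell[0]
                for cell in pyramid_cells(x, y, width, height, levels)
                if cell not in occupied
            )
            return (gain, response)

        best = max(remaining, key=score)
        remaining.remove(best)
        chosen.append(best)
        bx, by, _ = points[best]
        occupied.update(pyramid_cells(bx, by, width, height, levels))
    return chosen


# Hypothetical detections: three strong corners clustered top-left,
# plus three weaker ones in the other image quadrants.
detections = [
    (10, 10, 0.90), (12, 11, 0.80), (11, 13, 0.85),
    (90, 10, 0.30), (10, 90, 0.20), (90, 90, 0.10),
]
print(select_spread_features(detections, width=100, height=100, budget=4))
```

With these sample points, ranking purely by response would keep the three clustered corners (indices 0, 2, 1) and only one point elsewhere, whereas the coverage-driven selection keeps one point per quadrant. That is the kind of uniform constraint distribution the highlight above describes.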

What are the implications of the main findings?

Comprehensive validation demonstrates that HiRo-SLAM achieves superior positioning accuracy and scene adaptability across multiple benchmarks, setting a new state-of-the-art performance standard.

The integration of precise camera modeling, adaptive feature selection, and robust optimization offers a comprehensive technical solution for deploying high-precision visual-inertial SLAM systems in complex real-world environments.

HiRo-SLAM is a visual-inertial SLAM system developed to achieve high accuracy and enhanced robustness. To address critical limitations of conventional methods, including systematic biases from imperfect camera models, uneven spatial feature distribution, and the impact of outliers, we propose a unified optimization framework that integrates four key innovations. First, Precise Camera Projection Modeling (PCPM) embeds a fully differentiable camera model in nonlinear optimization, ensuring accurate handling of camera intrinsics and distortion to prevent error accumulation. Second, Visibility Pyramid-based Adaptive Non-Maximum Suppression (P-ANMS) quantifies feature point contribution through a multi-scale pyramid, providing uniform visual constraints in weakly textured or repetitive regions. Third, Robust Optimization Using Graduated Non-Convexity (GNC) suppresses outliers through dynamic weighting, preventing convergence to local minima. Finally, the Point-Line Feature Fusion Frontend combines XFeat point features with SOLD2 line features, leveraging multiple geometric primitives to improve perception in challenging environments, such as those with weak textures or repetitive structures. Comprehensive evaluations on the EuRoC MAV, TUM-VI, and OIVIO benchmarks show that HiRo-SLAM outperforms state-of-the-art visual-inertial SLAM methods. On the EuRoC MAV dataset, HiRo-SLAM achieves a 30.0% reduction in absolute trajectory error compared to strong baselines and attains millimeter-level accuracy on specific sequences under controlled conditions. However, while HiRo-SLAM demonstrates state-of-the-art performance in scenarios with moderate texture and minimal motion blur, its effectiveness may be reduced in highly dynamic environments with severe motion blur or extreme lighting conditions.
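
The "dynamic weighting" behind GNC can be sketched on a toy problem. The example below estimates a robust 1D mean using the commonly published GNC scheme with a Geman-McClure surrogate, alternating a weighted least-squares solve with a weight update while a control parameter `mu` is annealed toward 1. The cost function, the noise bound `c`, the annealing factor 1.4, and the sample data are assumptions; the summary does not state which robust cost or schedule HiRo-SLAM uses in its backend.

```python
def gnc_gm_mean(samples, c=0.5, mu_factor=1.4, max_iters=100):
    """Robust 1D mean via graduated non-convexity (Geman-McClure surrogate).

    c is the assumed inlier noise bound. mu starts large, which makes the
    surrogate cost nearly convex, and is divided by mu_factor each round
    until it reaches 1, where the original robust cost is recovered.
    """
    estimate = sum(samples) / len(samples)  # plain least-squares start
    r2_max = max((s - estimate) ** 2 for s in samples)
    mu = max(1.0, 2.0 * r2_max / (c * c))
    weights = [1.0] * len(samples)

    for _ in range(max_iters):
        # Weight update: large residuals are progressively down-weighted.
        mc2 = mu * c * c
        weights = [(mc2 / ((s - estimate) ** 2 + mc2)) ** 2 for s in samples]

        # Variable update: weighted least squares with the weights held fixed.
        estimate = sum(w * s for w, s in zip(weights, samples)) / sum(weights)

        if mu <= 1.0:
            break
        mu = max(1.0, mu / mu_factor)

    return estimate, weights


# Hypothetical data: five inliers near 1.0 and one gross outlier.
data = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0]
robust_mean, final_weights = gnc_gm_mean(data)
print("plain mean:", sum(data) / len(data))
print("GNC mean:", robust_mean)
print("outlier weight:", final_weights[-1])
```

In a SLAM backend the scalar mean would be replaced by the full nonlinear least-squares solve over poses and landmarks, with each reprojection or line residual carrying its own weight. Starting from a nearly convex surrogate is what reduces the risk of settling in a poor local minimum.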

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12845682/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12845682/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC12845682/full.md

---
Source: https://tomesphere.com/paper/PMC12845682