# VINA-SLAM: A Voxel-Based Inertial and Normal-Aligned LiDAR–IMU SLAM

**Authors:** Ruyang Zhang, Bingyu Sun

PMC · DOI: 10.3390/s26061810 · Sensors (Basel, Switzerland) · 2026-03-13

## TL;DR

VINA-SLAM improves LiDAR–IMU localization and mapping in geometrically degenerate environments, such as long corridors and narrow stairwells, by aligning raw scans to a global voxel map with normal-guided correspondences.

## Contribution

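Introduces VINA-SLAM, a LiDAR–IMU SLAM framework that aligns raw scans to a unified global voxel map using normal-guided correspondences, then refines poses with a tightly coupled sliding-window bundle adjustment that combines IMU factors, voxel normal consistency factors, and planar regularization terms.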

## Key findings

- VINA-SLAM reduces absolute trajectory error (ATE) by 25–40% on average on 25 sequences from the HILTI and MARS-LVIG datasets, which cover geometrically degenerate environments.
- The system maintains real-time performance at 10 Hz without explicit feature extraction or loop closure.
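- A tangent-space metric supplements missing rotational constraints around planar regions, and a planar regularizer based on the minimum eigenvalue of each voxel's covariance improves Hessian conditioning and cross-view geometric consistency.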
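
For readers unfamiliar with the headline metric, the following is a minimal sketch of how an ATE figure and a relative reduction are commonly computed. It assumes the estimated and ground-truth trajectories are already time-associated and expressed in the same frame; the paper's exact alignment and evaluation protocol is not given in this summary, and the function names `ate_rmse` and `relative_reduction` are hypothetical.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of position error between two trajectories.

    Assumption: both inputs are equal-length sequences of (x, y, z)
    positions that are already time-associated and aligned to a common
    frame. This is a generic definition, not the paper's protocol.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(p_est, p_gt))
        for p_est, p_gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline_ate, improved_ate):
    """Fractional ATE reduction of one method relative to a baseline."""
    if baseline_ate <= 0:
        raise ValueError("baseline ATE must be positive")
    return (baseline_ate - improved_ate) / baseline_ate
```

Under this definition, a "25–40% reduction" corresponds to `relative_reduction` values between 0.25 and 0.40 against whichever baseline is being compared.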

## Abstract

Environments with sparse or repetitive geometric structures, such as long corridors and narrow stairwells, remain challenging for LiDAR–inertial simultaneous localization and mapping (LiDAR–IMU SLAM) due to insufficient geometric observability and unreliable data associations. To address these issues, we propose VINA-SLAM, a novel LiDAR–IMU SLAM framework that constructs a unified global voxel map to explicitly exploit structural consistency. VINA-SLAM continuously tracks surface normals stored in the global voxel map using a normal-guided correspondence strategy, enabling stable scan-to-map alignment in degenerate scenes. Furthermore, a tangent-space metric is introduced to supplement missing rotational constraints around planar regions, providing reliable initial pose estimates for local optimization. A tightly coupled sliding-window bundle adjustment is then formulated by jointly incorporating IMU factors, voxel normal consistency factors, and planar regularization terms. In particular, the minimum eigenvalue of each voxel’s covariance is used as a statistically principled planar constraint, improving the Hessian conditioning and cross-view geometric consistency. The proposed system directly aligns raw LiDAR scans to the voxelized map without explicit feature extraction or loop closure. Experiments on 25 sequences from the HILTI and MARS-LVIG datasets show that VINA-SLAM reduces ATE by 25–40% on average while maintaining real-time performance at 10 Hz in the evaluated geometrically degenerate environments.
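
The minimum-eigenvalue planar constraint mentioned in the abstract can be illustrated with a small sketch. For the points in one voxel, the smallest eigenvalue of their covariance equals the mean squared distance of the points to the best-fit plane through their centroid, and the matching eigenvector is the plane normal. The code below is an illustration under stated assumptions, not the authors' implementation: the function names (`fit_voxel_plane`, `min_eigenvalue_sym3`) are hypothetical, the covariance uses a 1/n normalization, and the closed-form trigonometric eigenvalue solution for symmetric 3×3 matrices is chosen only to keep the example dependency-free.

```python
import math


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def min_eigenvalue_sym3(a):
    """Smallest eigenvalue of a symmetric 3x3 matrix (closed form)."""
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    if p2 == 0.0:
        return q  # matrix is q * identity
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    r = max(-1.0, min(1.0, _det3(b) / 2.0))  # clamp rounding error
    phi = math.acos(r) / 3.0
    lam = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return max(0.0, lam)  # a covariance matrix is positive semidefinite


def fit_voxel_plane(points):
    """Return (centroid, unit normal or None, min eigenvalue) for a voxel.

    The normal is None when the smallest eigenvalue is (numerically)
    repeated, e.g. points on a line, so no unique plane exists.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least 3 points to fit a plane")
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / n
    lam = min_eigenvalue_sym3(cov)
    # Rows of (cov - lam * I) are orthogonal to the eigenvector of lam,
    # so the cross product of two independent rows gives the normal.
    m = [[cov[i][j] - (lam if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    candidates = [_cross(m[0], m[1]), _cross(m[0], m[2]), _cross(m[1], m[2])]
    best = max(candidates, key=lambda c: sum(x * x for x in c))
    norm = math.sqrt(sum(x * x for x in best))
    if norm < 1e-12:  # absolute threshold; depends on the data scale
        return mean, None, lam
    return mean, tuple(x / norm for x in best), lam
```

A small minimum eigenvalue indicates that the voxel's points are well explained by a plane, which is what makes it usable as a residual: summing it over voxels penalizes poses that smear a planar surface across views. The sign of the returned normal is arbitrary, so any consumer comparing normals would need to fix an orientation convention.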

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13030591/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13030591/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/PMC13030591/full.md

---
Source: https://tomesphere.com/paper/PMC13030591