ROFT-VINS: Robust Feature Tracking-based Visual-Inertial State Estimation for Harsh Environment

Sanghyun Park and Soohee Han

arXiv:2603.18746·cs.RO·March 20, 2026

Open Access

TL;DR

This paper introduces ROFT-VINS, a deep learning-based visual feature tracking method that enhances robustness in challenging environments for visual-inertial SLAM systems, improving localization accuracy in textureless and rapidly changing lighting conditions.

Contribution

The paper presents a novel deep learning approach for robust visual feature tracking integrated into VINS-Fusion, addressing challenges in textureless and dynamic lighting environments.

Findings

01 Enhanced feature tracking robustness in textureless scenes

02 Improved localization accuracy under lighting variations

03 Successful integration into VINS-Fusion system

Abstract

SLAM (Simultaneous Localization and Mapping) and odometry systems estimate the position of mobile devices, such as robots and cars, using one or more sensors. In camera-based SLAM and odometry in particular, tracking visual features effectively is essential because it significantly affects system performance. In this paper, we propose a method that leverages deep learning to robustly track visual features in monocular camera images. The method operates reliably even in textureless environments and under rapid lighting changes. We evaluate the proposed method by integrating it into VINS-Fusion (Monocular-Inertial), a commonly used Visual-Inertial Odometry (VIO) system.

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques