SIGN: Safety-Aware Image-Goal Navigation for Autonomous Drones via Reinforcement Learning

Zichen Yan; Rui Huang; Lei He; Shao Guo; Lin Zhao

arXiv:2508.12394 · cs.RO · December 22, 2025

TL;DR

This paper introduces SIGN, a reinforcement learning-based framework enabling autonomous drones to perform image-goal navigation safely and effectively in unknown environments without relying on external localization or mapping.

Contribution

The paper presents a novel sim-to-real RL framework for drone ImageNav, incorporating auxiliary visual tasks and a depth-based safety module for comprehensive navigation.

Findings

01. Effective end-to-end ImageNav for drones demonstrated in simulation and real-world tests.

02. Enhanced visual representation through auxiliary tasks improves policy training.

03. Real-time obstacle avoidance enables safe navigation in cluttered environments.
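
To make the idea of a depth-based safety module concrete, here is a minimal sketch of one way a depth image could limit a commanded forward velocity. It is an illustration only and not the paper's method, which this summary does not describe. The function names (`min_center_depth`, `limit_forward_speed`), the central-window heuristic, and the thresholds `stop_dist` and `slow_dist` are all assumptions made for this example.

```python
from typing import Sequence


def min_center_depth(depth: Sequence[Sequence[float]], frac: float = 0.5) -> float:
    """Smallest valid depth (metres) in the central window of a depth image.

    Hypothetical helper: `frac` is the fraction of rows and columns kept
    around the image centre. Non-positive readings are treated as invalid.
    If the window holds no valid reading, 0.0 is returned so the caller
    treats the view as blocked (a conservative assumption).
    """
    rows, cols = len(depth), len(depth[0])
    r0 = int(rows * (1.0 - frac) / 2.0)
    c0 = int(cols * (1.0 - frac) / 2.0)
    window = [d for row in depth[r0:rows - r0] for d in row[c0:cols - c0] if d > 0.0]
    return min(window) if window else 0.0


def limit_forward_speed(
    v_cmd: float,
    depth: Sequence[Sequence[float]],
    stop_dist: float = 0.5,  # assumed: halt forward motion at or below this range
    slow_dist: float = 2.0,  # assumed: no limiting at or beyond this range
) -> float:
    """Scale a forward velocity command by the nearest obstacle distance."""
    if v_cmd <= 0.0:
        return v_cmd  # only forward motion is limited in this sketch
    d = min_center_depth(depth)
    if d <= stop_dist:
        return 0.0
    if d >= slow_dist:
        return v_cmd
    # Linear ramp between the stop and slow-down distances.
    return v_cmd * (d - stop_dist) / (slow_dist - stop_dist)
```

A filter of this kind sits between the policy output and the flight controller, so the learned policy is left unchanged while the command it issues is attenuated near obstacles.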

Abstract

Image-goal navigation (ImageNav) tasks a robot with autonomously exploring an unknown environment and reaching a location that visually matches a given target image. While prior works primarily study ImageNav for ground robots, enabling this capability for autonomous drones is substantially more challenging due to their need for high-frequency feedback control and global localization for stable flight. In this paper, we propose a novel sim-to-real framework that leverages reinforcement learning (RL) to achieve ImageNav for drones. To enhance visual representation ability, our approach trains the vision backbone with auxiliary tasks, including image perturbations and future transition prediction, which results in more effective policy training. The proposed algorithm enables end-to-end ImageNav with direct velocity control, eliminating the need for external localization. Furthermore, we…
