SARO: Space-Aware Robot System for Terrain Crossing via Vision-Language Model

Shaoting Zhu, Derun Li, Linzhan Mou, Yong Liu, Ningyi Xu, and Hang Zhao

arXiv:2407.16412 · cs.RO · March 18, 2025

TL;DR

SARO is a novel robot system that combines vision-language models, reasoning, and reinforcement learning to enable quadruped robots to navigate complex 3D terrains effectively and robustly.

Contribution

This work introduces SARO, integrating VLM-based reasoning, task decomposition, and PAS-trained control policies for advanced terrain crossing in robotics.

Findings

01. Successfully navigates diverse 3D terrains

02. Demonstrates robustness and accuracy in indoor and outdoor scenarios

03. Generalizes well across various environments

Abstract

Vision-language models (VLMs) have achieved impressive success in various robotics tasks. However, few works have explored using these foundation models for quadruped robot navigation across terrains in 3D environments. In this work, we introduce SARO (Space-Aware Robot System for Terrain Crossing), an innovative system composed of a high-level reasoning module, a closed-loop sub-task execution module, and a low-level control policy. It enables the robot to navigate across 3D terrains and reach the goal position. For high-level reasoning and execution, we propose a novel algorithmic system that uses a VLM, with task decomposition and a closed-loop sub-task execution mechanism. For low-level locomotion control, we use the Probability Annealing Selection (PAS) method to train a control policy by reinforcement learning. Numerous…
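The closed-loop sub-task execution idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `SubTask`, `run_closed_loop`, `decompose`, `execute`, `is_done`, and `max_attempts` are all hypothetical, and the VLM and the locomotion policy are stood in for by plain callables. It only shows the general pattern of decomposing a goal, running each sub-task, and checking completion before moving on.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SubTask:
    name: str  # hypothetical label, e.g. "climb_stairs"


def run_closed_loop(
    goal: str,
    decompose: Callable[[str], List[SubTask]],  # stand-in for VLM task decomposition
    execute: Callable[[SubTask], None],         # stand-in for the low-level control policy
    is_done: Callable[[SubTask], bool],         # stand-in for a VLM completion check
    max_attempts: int = 3,                      # assumed retry budget, not from the paper
) -> List[str]:
    """Run sub-tasks in order, re-executing each until the check passes."""
    log: List[str] = []
    for sub_task in decompose(goal):
        for attempt in range(1, max_attempts + 1):
            execute(sub_task)
            if is_done(sub_task):
                log.append(f"{sub_task.name}: done after {attempt} attempt(s)")
                break
        else:
            # Retry budget exhausted: stop, since later sub-tasks depend on this one.
            log.append(f"{sub_task.name}: failed")
            break
    return log
```

In this sketch the feedback check runs after every execution step, so a sub-task that did not complete is retried rather than silently skipped; how SARO itself schedules checks and retries is not specified in the text above.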

Taxonomy

Topics: Robotics and Automated Systems