CoViS-Net: A Cooperative Visual Spatial Foundation Model for Multi-Robot Applications

Jan Blumenkamp; Steven Morad; Jennifer Gielis; Amanda Prorok

arXiv:2405.01107·cs.RO·October 17, 2024·2 cites

Open Access

TL;DR

CoViS-Net is a decentralized visual spatial foundation model that gives multi-robot systems real-time relative pose estimation and spatial understanding without relying on existing networking infrastructure. It is demonstrated on multi-robot formation control tasks.

Contribution

It introduces a fully decentralized, platform-agnostic model for multi-robot spatial understanding that functions without camera overlap or existing networking infrastructure.

Findings

01 Provides accurate relative pose estimates

02 Enables real-time spatial comprehension

03 Supports multi-robot formation control
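
To illustrate how a relative pose estimate could feed a formation control task, here is a minimal sketch of a proportional controller. It is not the controller used by the authors. The function name `formation_velocity`, the 2D body-frame convention, and the `gain` and `max_speed` parameters are all assumptions made for illustration.

```python
import math


def formation_velocity(rel_xy, desired_xy, gain=0.5, max_speed=1.0):
    """Hypothetical proportional formation controller (illustrative only).

    rel_xy:     estimated (x, y) position of a neighbour in this robot's
                body frame, e.g. taken from a relative pose estimate.
    desired_xy: (x, y) position at which the neighbour should appear.
    Returns a body-frame (vx, vy) velocity command.
    """
    # Error between where the neighbour is seen and where it should be seen.
    ex = rel_xy[0] - desired_xy[0]
    ey = rel_xy[1] - desired_xy[1]

    # Moving this robot along the error shrinks the observed offset.
    vx, vy = gain * ex, gain * ey

    # Saturate the command so it never exceeds max_speed.
    speed = math.hypot(vx, vy)
    if speed > max_speed:
        scale = max_speed / speed
        vx, vy = vx * scale, vy * scale
    return vx, vy
```

For example, if a neighbour is observed 2 m ahead and the formation calls for 1 m, `formation_velocity((2.0, 0.0), (1.0, 0.0))` commands a forward motion that closes the gap. Rotation and estimate uncertainty are ignored in this sketch.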

Abstract

Autonomous robot operation in unstructured environments is often underpinned by spatial understanding through vision. Systems composed of multiple concurrently operating robots additionally require access to frequent, accurate and reliable pose estimates. In this work, we propose CoViS-Net, a decentralized visual spatial foundation model that learns spatial priors from data, enabling pose estimation as well as spatial comprehension. Our model is fully decentralized, platform-agnostic, executable in real-time using onboard compute, and does not require existing networking infrastructure. CoViS-Net provides relative pose estimates and a local bird's-eye-view (BEV) representation, even without camera overlap between robots (in contrast to classical methods). We demonstrate its use in a multi-robot formation control task across various real-world settings. We provide code, models and…

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques