VertiCoder: Self-Supervised Kinodynamic Representation Learning on Vertically Challenging Terrain

Mohammad Nazeri, Aniket Datar, Anuj Pokhrel, Chenhui Pan, Garrett Warnell, and Xuesu Xiao

arXiv:2409.11570 · cs.RO · March 10, 2025

TL;DR

VertiCoder introduces a self-supervised Transformer-based method for kinodynamic representation learning, enabling versatile robot mobility on challenging terrain with fewer parameters and robust generalization across tasks and environments.

Contribution

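The paper proposes VertiCoder, a self-supervised learning framework that uses a Transformer encoder for kinodynamic modeling. A single pre-trained representation serves four downstream tasks (forward kinodynamics learning, inverse kinodynamics learning, behavior cloning, and patch reconstruction) with 77% fewer parameters than specialized end-to-end models.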

Findings

01 Outperforms specialized end-to-end models on all four downstream tasks

02 Uses 77% fewer parameters than those specialized end-to-end models

03 Performs comparably to state-of-the-art kinodynamic modeling and planning approaches in real-world robot deployment

Abstract

We present VertiCoder, a self-supervised representation learning approach for robot mobility on vertically challenging terrain. Using the same pre-training process, VertiCoder can handle four different downstream tasks, including forward kinodynamics learning, inverse kinodynamics learning, behavior cloning, and patch reconstruction with a single representation. VertiCoder uses a TransformerEncoder to learn the local context of its surroundings by random masking and next patch reconstruction. We show that VertiCoder achieves better performance across all four different tasks compared to specialized End-to-End models with 77% fewer parameters. We also show VertiCoder's comparable performance against state-of-the-art kinodynamic modeling and planning approaches in real-world robot deployment. These results underscore the efficacy of VertiCoder in mitigating overfitting and fostering more…
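The pre-training signal described in the abstract (random masking plus next patch reconstruction) can be illustrated with a small data-preparation sketch. This is not the authors' implementation: the function name `make_pretraining_example`, the `mask_ratio` and `mask_value` parameters, and the representation of a terrain patch as a flat list of floats are all assumptions made for illustration. The sketch only shows how a time-ordered patch sequence might be split into a randomly masked context and a next-patch target; the Transformer encoder that would consume the context is omitted.

```python
import random


def make_pretraining_example(patches, mask_ratio=0.5, mask_value=0.0, seed=None):
    """Build one (masked context, next-patch target) pair.

    Hypothetical helper for illustration only.
    patches: time-ordered list of flat patches, each a list of floats.
    The last patch is held out as the reconstruction target; a random
    subset of the remaining context patches is overwritten with mask_value.
    """
    if len(patches) < 2:
        raise ValueError("need at least one context patch and one target patch")

    rng = random.Random(seed)
    context, target = patches[:-1], patches[-1]

    # Choose which context positions to hide (assumed masking scheme).
    num_masked = round(mask_ratio * len(context))
    masked_idx = set(rng.sample(range(len(context)), num_masked))

    masked_context = [
        [mask_value] * len(patch) if i in masked_idx else list(patch)
        for i, patch in enumerate(context)
    ]
    return masked_context, list(target), sorted(masked_idx)


if __name__ == "__main__":
    # Five toy "patches" of four values each; the fifth is the target.
    toy_patches = [[float(i)] * 4 for i in range(1, 6)]
    ctx, tgt, idx = make_pretraining_example(toy_patches, mask_ratio=0.5, seed=0)
    print("masked positions:", idx)
    print("context:", ctx)
    print("target:", tgt)
```

In a full pipeline, a model would receive the masked context (together with any robot state or action inputs the method uses) and be trained to reconstruct the held-out next patch; the reconstruction loss and model architecture are outside the scope of this sketch.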

Code & Models

Repositories

mhnazeri/verticoder (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Video Surveillance and Tracking Methods