EchoPT: A Pretrained Transformer Architecture that Predicts 2D In-Air Sonar Images for Mobile Robotics
Jan Steckel, Wouter Jansen, Nico Huebel

TL;DR
EchoPT is a transformer-based model that predicts 2D in-air sonar images from past data and robot motion, enabling predictive perception for mobile robots using ultrasound sensors.
Contribution
The paper introduces EchoPT, a novel pretrained transformer architecture specifically designed for predicting sonar images in robotic perception tasks.
Findings
EchoPT outperforms several state-of-the-art methods in sonar image prediction.
The model enables improved predictive perception in robotic navigation tasks.
EchoPT demonstrates robustness in real-world robotic experiments.
Abstract
The predictive brain hypothesis suggests that perception can be interpreted as the process of minimizing the error between predicted perception tokens generated by an internal world model and actual sensory input tokens. When implementing working examples of this hypothesis in the context of in-air sonar, significant difficulties arise due to the sparse nature of the reflection model that governs ultrasonic sensing. Despite these challenges, creating consistent world models using sonar data is crucial for implementing predictive processing of ultrasound data in robotics. In an effort to enable robust robot behavior using ultrasound as the sole exteroceptive sensor modality, this paper introduces EchoPT, a pretrained transformer architecture designed to predict 2D sonar images from previous sensory data and robot ego-motion information. We detail the transformer architecture that drives…
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging
