Self-Supervised Visual Terrain Classification from Unsupervised Acoustic Feature Learning
Jannik Zürn, Wolfram Burgard, Abhinav Valada

TL;DR
This paper introduces a self-supervised framework for visual terrain classification in robots, using acoustic features learned from vehicle-terrain interactions to reduce manual labeling and improve segmentation accuracy.
Contribution
It presents a novel unsupervised acoustic feature learning approach that self-supervises visual terrain segmentation, reducing reliance on manual labels and achieving competitive performance.
Findings
The unsupervised proprioceptive (acoustic) classifier outperforms existing unsupervised methods.
The self-supervised visual segmentation model comes close to the performance of a fully supervised one.
The framework significantly reduces manual labeling effort.
Abstract
Mobile robots operating in unknown urban environments encounter a wide range of complex terrains to which they must adapt their planned trajectory for safe and efficient navigation. Most existing approaches utilize supervised learning to classify terrains from either an exteroceptive or a proprioceptive sensor modality. However, this requires a tremendous amount of manual labeling effort for each newly encountered terrain as well as for variations of terrains caused by changing environmental conditions. In this work, we propose a novel terrain classification framework leveraging an unsupervised proprioceptive classifier that learns from vehicle-terrain interaction sounds to self-supervise an exteroceptive classifier for pixel-wise semantic segmentation of images. To this end, we first learn a discriminative embedding space for vehicle-terrain interaction sounds from triplets of audio…
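
The abstract mentions learning an embedding space from triplets of audio, but the text is cut off before it describes the loss. As a rough illustration only, the sketch below uses the standard triplet margin loss, which may differ from the authors' formulation. The function names, the margin value, and the toy 2-D embeddings are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (an assumption, not taken from the paper).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows linearly.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings standing in for embedded audio clips (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # intended to be the same terrain as the anchor
negative = [3.0, 4.0]   # intended to be a different terrain

well_separated = triplet_margin_loss(anchor, positive, negative)
violated = triplet_margin_loss(anchor, negative, positive)  # roles swapped
```

With these values the first triplet already satisfies the margin and contributes no loss, while the swapped triplet is penalized. Minimizing such a loss over many triplets pulls clips from the same terrain together and pushes clips from different terrains apart.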
