TL;DR
D-CLING introduces a novel fine-tuning approach for Navigation Foundation Models that preserves pre-trained knowledge while efficiently learning geometric cues, enhancing robustness and generalization in real-world navigation tasks.
Contribution
The paper proposes a ControlNet-inspired fine-tuning method that attaches a trainable residual pathway to the NFM, so that pre-trained knowledge is preserved during adaptation.
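To make the idea of a trainable residual pathway concrete, here is a minimal sketch of the generic ControlNet-style pattern: a frozen base layer, a trainable side branch, and a zero-initialized projection joining the two, so that the adapted model initially reproduces the pre-trained output exactly. This is not the paper's implementation. The class name `ResidualAdapter`, the single scalar `cue` input, the linear layers, and the squared-error update are all hypothetical choices made for illustration.

```python
import random

def linear(weight, bias, x):
    """Apply y = W x + b using plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

class ResidualAdapter:
    """Frozen base layer plus a trainable branch joined by a zero-initialized projection.

    Hypothetical illustration only; not the architecture from the paper.
    """

    def __init__(self, base_weight, base_bias, seed=0):
        rng = random.Random(seed)
        n_out, n_in = len(base_weight), len(base_weight[0])
        # Frozen pre-trained parameters: never updated below.
        self.base_weight = [row[:] for row in base_weight]
        self.base_bias = base_bias[:]
        # Trainable branch (assumed): sees the same input plus one extra cue.
        self.branch_weight = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                              for _ in range(n_out)]
        self.branch_bias = [0.0] * n_out
        # Zero-initialized projection: the branch contributes nothing at the start.
        self.zero_proj = [[0.0] * n_out for _ in range(n_out)]

    def forward(self, x, cue):
        base_out = linear(self.base_weight, self.base_bias, x)
        branch_out = linear(self.branch_weight, self.branch_bias, x + [cue])
        residual = linear(self.zero_proj, [0.0] * len(base_out), branch_out)
        return [b + r for b, r in zip(base_out, residual)]

    def sgd_step(self, x, cue, target, lr=0.1):
        """One squared-error gradient step on the projection only (for brevity)."""
        branch_out = linear(self.branch_weight, self.branch_bias, x + [cue])
        err = [p - t for p, t in zip(self.forward(x, cue), target)]
        for i, e in enumerate(err):
            for j, h in enumerate(branch_out):
                # d/dP_ij of sum_i (pred_i - target_i)^2 is 2 * e_i * h_j.
                self.zero_proj[i][j] -= lr * 2.0 * e * h

def squared_error(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target))

# Usage: the adapted model starts out identical to the frozen base layer.
base_w = [[1.0, 0.0], [0.0, 1.0]]
base_b = [0.0, 0.0]
model = ResidualAdapter(base_w, base_b)
x, cue, target = [0.5, -0.2], 1.0, [0.0, 0.0]

pretrained = linear(base_w, base_b, x)
before = model.forward(x, cue)
model.sgd_step(x, cue, target)
after = model.forward(x, cue)
```

Because the projection starts at zero, `before` equals the pre-trained output, and the update changes only the residual pathway while `base_weight` stays untouched. A full ControlNet-style setup would also train the branch weights; that is omitted here to keep the sketch short.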
Findings
Enables robust long-horizon navigation with minimal collisions.
Maintains or improves action prediction on data beyond the fine-tuning dataset.
Effective in diverse environments and camera configurations.
Abstract
Navigation Foundation Models (NFMs) trained on large cross-embodied datasets have demonstrated powerful generalizability in various scenarios. Adopting in-domain fine-tuning for an NFM efficiently calibrates the visuomotor policy, promising further improvement even in a novel scenario. However, the fine-tuned models still suffer from poor obstacle avoidance or fail to properly reach the provided goals. Furthermore, model updates using a small subset of data typically erode the pre-trained prior, compromising the pre-training generalization. Consequently, fine-tuning deteriorates the capability of the model for robust and accurate navigation. In this work, we present a novel fine-tuning method that leverages large-scale pre-training while efficiently learning in novel setups, such as environments or camera configurations. In particular, inspired by ControlNet, we fine-tune an NFM by…
