MAGIC: Meta-Ability Guided Interactive Chain-of-Distillation for Effective-and-Efficient Vision-and-Language Navigation
Liuyi Wang, Zongtao He, Mengjiao Shen, Jingwei Yang, Chengju Liu, Qijun Chen

TL;DR
This paper introduces MAGIC, a novel knowledge distillation framework for creating lightweight, efficient vision-and-language navigation models that outperform previous methods and are suitable for real-time robotics applications.
Contribution
MAGIC combines meta-ability guided distillation with interactive chain learning, enabling effective multi-step teacher-student co-evolution for VLN tasks.
Findings
MAGIC-S, with only 5% of the teacher's size, outperforms previous methods.
MAGIC-L surpasses the previous state of the art by 5.84% in SPL and 3.18% in SR.
The method demonstrates superior real-time performance on a new dataset.
Abstract
Despite the remarkable developments of recent large models in Embodied Artificial Intelligence (E-AI), their integration into robotics is hampered by their excessive parameter sizes and computational demands. Towards the Vision-and-Language Navigation (VLN) task, a core task in E-AI, this paper reveals the great potential of using knowledge distillation for obtaining lightweight student models by proposing a Meta-Ability Guided Interactive Chain-of-distillation (MAGIC) method. Specifically, a Meta-Ability Knowledge Distillation (MAKD) framework is proposed for decoupling and refining the necessary meta-abilities of VLN agents. A Meta-Knowledge Randomization Weighting (MKRW) and a Meta-Knowledge Transferable Determination (MKTD) module are incorporated to dynamically adjust aggregation weights at the meta-ability and sample levels, respectively. Move beyond the traditional one-step…
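The abstract describes distillation terms that are weighted at two levels: per meta-ability and per sample. The sketch below illustrates that general idea only and is not the paper's formulation. It uses standard temperature-softened KL distillation; the random normalized weights (a stand-in for the meta-ability-level weighting), the "teacher must be correct" rule (a stand-in for the sample-level determination), the ability names, and every function name are assumptions made for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def random_meta_weights(names, rng):
    """Assumption: draw random positive weights per meta-ability, normalized to sum to 1."""
    raw = {name: rng.random() + 1e-6 for name in names}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def sample_weight(teacher_action_logits, label):
    """Assumption: transfer knowledge only when the teacher's top action matches the label."""
    predicted = max(range(len(teacher_action_logits)),
                    key=teacher_action_logits.__getitem__)
    return 1.0 if predicted == label else 0.0


def distill_loss(batch, meta_weights, temperature=2.0):
    """Mean over samples of (sample weight) * sum of (meta weight) * KL(teacher || student)."""
    total = 0.0
    for sample in batch:
        w_sample = sample_weight(sample["teacher_action"], sample["label"])
        per_sample = 0.0
        for name, w_meta in meta_weights.items():
            p_teacher = softmax(sample["teacher"][name], temperature)
            p_student = softmax(sample["student"][name], temperature)
            per_sample += w_meta * kl_divergence(p_teacher, p_student)
        total += w_sample * per_sample
    return total / len(batch)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical meta-ability names, chosen only for this example.
    weights = random_meta_weights(["ability_a", "ability_b"], rng)
    batch = [{
        "teacher_action": [2.0, 0.1],
        "label": 0,
        "teacher": {"ability_a": [1.0, 2.0], "ability_b": [3.0, 1.0]},
        "student": {"ability_a": [0.5, 0.5], "ability_b": [1.0, 1.0]},
    }]
    print(distill_loss(batch, weights))
```

In this sketch, resampling the meta-ability weights at each training step keeps any single term from dominating, and the sample-level gate skips examples where the teacher itself is wrong. Both rules are placeholders and could be swapped for whatever the paper actually specifies.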
Taxonomy
Topics: Multimodal Machine Learning Applications · Speech and dialogue systems
Methods: Semi-Pseudo-Label · Knowledge Distillation
