Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning
Yuhui Wang, Qingyuan Wu, Dylan R. Ashley, Francesco Faccio, Weida Li, Chao Huang, Jürgen Schmidhuber

TL;DR
This paper introduces DT-VIN, an extension of Value Iteration Networks that scales to 5000 layers for long-term planning by increasing the representational capacity of the latent MDP and improving gradient flow through the planning module.
Contribution
The paper proposes Dynamic Transition VIN (DT-VIN), which improves the scalability of VINs for long-term planning by augmenting the latent MDP with a dynamic transition kernel and introducing an adaptive highway loss that adds skip connections to counter vanishing gradients.
Findings
Scales to 5000 layers in planning tasks
Successfully navigates complex maze and control environments
Solves real-world Lunar rover navigation challenges
Abstract
The Value Iteration Network (VIN) is an end-to-end differentiable neural network architecture for planning. It exhibits strong generalization to unseen domains by incorporating a differentiable planning module that operates on a latent Markov Decision Process (MDP). However, VINs struggle to scale to long-term and large-scale planning tasks, such as navigating a 100x100 maze -- a task that typically requires thousands of planning steps to solve. We observe that this deficiency is due to two issues: the representation capacity of the latent MDP and the planning module's depth. We address these by augmenting the latent MDP with a dynamic transition kernel, dramatically improving its representational capacity, and, to mitigate the vanishing gradient problem, introduce an "adaptive highway loss" that constructs skip connections to improve gradient flow. We evaluate our method on 2D/3D maze…
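The abstract's claim that large mazes need thousands of planning steps can be illustrated with classical value iteration. The sketch below is not DT-VIN or a VIN: it is plain tabular value iteration on a grid maze, under the assumption that each layer of a VIN planning module plays the role of one such iteration. The helper names `serpentine_maze` and `value_iteration_steps` are hypothetical and exist only for this illustration.

```python
import numpy as np


def serpentine_maze(n):
    """Build an n x n maze (1 = wall, 0 = free) whose wall rows force a snaking path."""
    maze = np.zeros((n, n), dtype=int)
    for i, r in enumerate(range(1, n, 2)):
        maze[r, :] = 1
        # Alternate the single gap between the right and left edge.
        maze[r, -1 if i % 2 == 0 else 0] = 0
    return maze


def value_iteration_steps(maze, goal, max_iters=10_000):
    """Deterministic value iteration with a step cost of 1.

    Returns the value map and the number of iterations that changed it,
    which is the planning depth needed for values to reach every cell.
    """
    free = maze == 0
    V = np.full(maze.shape, -np.inf)
    V[goal] = 0.0
    for k in range(1, max_iters + 1):
        # Pad with -inf so that moves off the grid are never selected.
        P = np.pad(V, 1, constant_values=-np.inf)
        # Best value among the four neighbours (up, down, left, right).
        best = np.maximum.reduce(
            [P[:-2, 1:-1], P[2:, 1:-1], P[1:-1, :-2], P[1:-1, 2:]]
        )
        V_new = np.where(free, np.maximum(V, best - 1.0), -np.inf)
        V_new[goal] = 0.0
        if np.array_equal(V_new, V):
            return V, k - 1
        V = V_new
    return V, max_iters


maze = serpentine_maze(5)
V, iters = value_iteration_steps(maze, goal=(4, 4))
print(iters, V[0, 0])  # 16 -16.0
```

In this 5x5 example the start and goal are 8 cells apart in Manhattan distance, yet 16 iterations are needed before the top-left cell receives a value, because information propagates one cell per iteration along the snaking path. The same effect on much larger mazes is one way to read the abstract's motivation for very deep planning modules.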
Taxonomy
Topics: Systems Engineering Methodologies and Applications · Model-Driven Software Engineering Techniques
