Planning as Descent: Goal-Conditioned Latent Trajectory Synthesis in Learned Energy Landscapes
Carlos Vélez García, Miguel Cazorla, Jorge Pomares

TL;DR
Planning as Descent (PaD) learns a goal-conditioned energy landscape over latent trajectories and plans by gradient-based refinement within it. Applied to offline goal-conditioned reinforcement learning, it achieves state-of-the-art results on OGBench cube manipulation tasks.
Contribution
PaD is a framework that learns an energy function over latent trajectories, so that planning becomes gradient-based refinement without an explicit policy or planner; this improves offline planning performance.
Findings
Achieves a 95% success rate on OGBench cube manipulation tasks.
Outperforms prior methods, which peak at 68%.
Training on noisy data further improves success and efficiency.
Abstract
We present Planning as Descent (PaD), a framework for offline goal-conditioned reinforcement learning that grounds trajectory synthesis in verification. Instead of learning a policy or explicit planner, PaD learns a goal-conditioned energy function over entire latent trajectories, assigning low energy to feasible, goal-consistent futures. Planning is realized as gradient-based refinement in this energy landscape, using identical computation during training and inference to reduce train-test mismatch common in decoupled modeling pipelines. PaD is trained via self-supervised hindsight goal relabeling, shaping the energy landscape around the planning dynamics. At inference, multiple trajectory candidates are refined under different temporal hypotheses, and low-energy plans balancing feasibility and efficiency are selected. We evaluate PaD on OGBench cube manipulation tasks. When…
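The planning loop described in the abstract (refine several candidate trajectories by gradient descent on an energy, one per temporal hypothesis, then keep a low-energy plan) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the energy is a hand-written quadratic over scalar latents rather than a learned network, and the names `energy`, `energy_grad`, `refine`, `plan`, `goal_weight`, and `time_cost` are hypothetical.

```python
import random

def energy(traj, start, goal, goal_weight=5.0):
    """Toy stand-in for a learned energy (assumption, not PaD's model).
    Low when consecutive latents are close (feasibility proxy) and the
    final latent is near the goal (goal consistency)."""
    path = [start] + list(traj)
    smooth = sum((b - a) ** 2 for a, b in zip(path, path[1:]))
    return smooth + goal_weight * (traj[-1] - goal) ** 2

def energy_grad(traj, start, goal, goal_weight=5.0):
    """Analytic gradient of the toy energy with respect to each latent."""
    path = [start] + list(traj)
    grad = []
    for i in range(len(traj)):
        g = 2.0 * (path[i + 1] - path[i])           # step from predecessor
        if i + 1 < len(traj):
            g -= 2.0 * (path[i + 2] - path[i + 1])  # step to successor
        grad.append(g)
    grad[-1] += 2.0 * goal_weight * (traj[-1] - goal)
    return grad

def refine(traj, start, goal, steps=1000, lr=0.05):
    """Planning as descent: plain gradient steps on the trajectory itself."""
    traj = list(traj)
    for _ in range(steps):
        grad = energy_grad(traj, start, goal)
        traj = [z - lr * g for z, g in zip(traj, grad)]
    return traj

def plan(start, goal, horizons=(2, 4, 8), time_cost=0.05, seed=0):
    """Refine one random candidate per horizon (temporal hypothesis) and
    return the (score, horizon, trajectory) with the lowest score.
    The per-step time_cost is a hypothetical efficiency penalty."""
    rng = random.Random(seed)
    candidates = []
    for horizon in horizons:
        init = [rng.gauss(0.0, 1.0) for _ in range(horizon)]
        traj = refine(init, start, goal)
        score = energy(traj, start, goal) + time_cost * horizon
        candidates.append((score, horizon, traj))
    return min(candidates, key=lambda c: c[0])

if __name__ == "__main__":
    score, horizon, traj = plan(start=0.0, goal=1.0)
    print(horizon, round(score, 3), [round(z, 3) for z in traj])
```

In this toy setting a longer horizon lowers the smoothness term while the time penalty grows, so the selection step trades feasibility against efficiency. How PaD actually scores and selects candidates is not specified in the text above.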
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Autonomous Vehicle Technology and Safety
