Planning as Descent: Goal-Conditioned Latent Trajectory Synthesis in Learned Energy Landscapes

Carlos Vélez García; Miguel Cazorla; Jorge Pomares

arXiv:2512.17846 · cs.RO · December 22, 2025

TL;DR

Planning as Descent (PaD) learns a goal-conditioned energy landscape over trajectories and plans by gradient-based refinement within it. Applied to offline goal-conditioned reinforcement learning, it achieves state-of-the-art results on OGBench cube manipulation tasks.

Contribution

PaD learns an energy function over latent trajectories, which allows gradient-based planning without an explicit policy or planner and improves offline planning performance.

Findings

01. Achieves a 95% success rate on OGBench cube manipulation tasks.
02. Outperforms prior methods, which peak at 68%.
03. Training on noisy data further improves success and efficiency.

Abstract

We present Planning as Descent (PaD), a framework for offline goal-conditioned reinforcement learning that grounds trajectory synthesis in verification. Instead of learning a policy or explicit planner, PaD learns a goal-conditioned energy function over entire latent trajectories, assigning low energy to feasible, goal-consistent futures. Planning is realized as gradient-based refinement in this energy landscape, using identical computation during training and inference to reduce train-test mismatch common in decoupled modeling pipelines. PaD is trained via self-supervised hindsight goal relabeling, shaping the energy landscape around the planning dynamics. At inference, multiple trajectory candidates are refined under different temporal hypotheses, and low-energy plans balancing feasibility and efficiency are selected. We evaluate PaD on OGBench cube manipulation tasks. When…
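The idea of planning as gradient-based refinement, with several candidates refined under different temporal hypotheses, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the energy is a hand-written quadratic rather than a learned network, the "latent" states are plain 2-D points, and the names `energy`, `refine`, `plan`, and `step_cost` are hypothetical. The toy energy penalizes large steps (a stand-in for feasibility) and distance from the final state to the goal (goal consistency).

```python
import numpy as np

def energy(traj, start, goal, goal_weight=1.0):
    """Toy stand-in for a learned energy: squared step lengths plus goal mismatch."""
    path = np.vstack([start, traj])
    steps = np.diff(path, axis=0)
    return float(np.sum(steps ** 2) + goal_weight * np.sum((traj[-1] - goal) ** 2))

def energy_grad(traj, start, goal, goal_weight=1.0):
    """Analytic gradient of the toy energy with respect to every trajectory state."""
    path = np.vstack([start, traj])
    steps = np.diff(path, axis=0)        # steps[i] = traj[i] - previous state
    grad = 2.0 * steps                   # from the step arriving at traj[i]
    grad[:-1] -= 2.0 * steps[1:]         # from the step leaving traj[i]
    grad[-1] += 2.0 * goal_weight * (traj[-1] - goal)
    return grad

def refine(traj, start, goal, n_steps=500, lr=0.1):
    """Plain gradient descent on the whole trajectory at once."""
    traj = traj.copy()
    for _ in range(n_steps):
        traj -= lr * energy_grad(traj, start, goal)
    return traj

def plan(start, goal, horizons=(2, 4, 8), step_cost=1.0, seed=0):
    """Refine one random candidate per horizon; keep the best energy-plus-length score."""
    rng = np.random.default_rng(seed)
    best = None
    for horizon in horizons:
        init = rng.normal(size=(horizon, start.shape[0]))
        traj = refine(init, start, goal)
        # Hypothetical efficiency term: a fixed cost per planned step.
        score = energy(traj, start, goal) + step_cost * horizon
        if best is None or score < best[0]:
            best = (score, horizon, traj)
    return best

if __name__ == "__main__":
    score, horizon, traj = plan(np.array([0.0, 0.0]), np.array([3.0, 4.0]))
    print(horizon, round(score, 3))
```

In this toy setting the raw energy always decreases as the horizon grows, so the per-step cost is what lets a shorter plan win. It is only a crude proxy for the balance between feasibility and efficiency that the abstract describes.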

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Autonomous Vehicle Technology and Safety