TL;DR
Sparse ActionGen (SAG) accelerates diffusion-based action generation for robots by using adaptive pruning and activation reuse, achieving up to 4× speedup in real-time control without performance loss.
Contribution
SAG introduces a rollout-adaptive prune-then-reuse mechanism and an environment-aware diffusion pruner for efficient, real-time sparse action generation in robotic applications.
Findings
Achieves up to 4× speedup in action generation.
Maintains performance while accelerating the diffusion process.
Demonstrates effectiveness on multiple robotic benchmarks.
Abstract
Diffusion Policy has dominated action generation due to its strong capabilities for modeling multi-modal action distributions, but its multi-step denoising processes make it impractical for real-time visuomotor control. Existing caching-based acceleration methods typically rely on schedules that fail to adapt to the dynamics of robot-environment interactions, thereby leading to suboptimal performance. In this paper, we propose Sparse ActionGen (SAG) for extremely sparse action generation. To accommodate the iterative interactions, SAG customizes a rollout-adaptive prune-then-reuse mechanism that first identifies prunable computations globally and then reuses cached activations to substitute them during action diffusion. To capture the rollout dynamics, SAG parameterizes an…
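The prune-then-reuse idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the scalar "blocks", the residual-style cache, the hand-written `prune_mask`, and the names `make_block` and `denoise_with_reuse` are all assumptions made for illustration, and the loop does not model a real denoiser. In SAG the mask would come from the learned pruner rather than being fixed by hand. The sketch only shows the control flow: a block marked as pruned at a given step is skipped and its most recently cached output is substituted.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A toy "block": maps (hidden state, timestep index) to a residual update.
Block = Callable[[float, int], float]


def make_block(scale: float) -> Block:
    """Hypothetical stand-in for one network block of the denoiser."""
    def block(h: float, k: int) -> float:
        return scale * h + 0.01 * k
    return block


def denoise_with_reuse(
    blocks: Sequence[Block],
    prune_mask: Sequence[Sequence[int]],
    x_init: float,
) -> Tuple[float, int]:
    """Run len(prune_mask) steps; prune_mask[k][b] == 1 skips block b at step k.

    Returns the final state and the number of block evaluations performed.
    """
    cache: List[Optional[float]] = [None] * len(blocks)
    evaluated = 0
    x = x_init
    for k, step_mask in enumerate(prune_mask):
        h = x
        for b, block in enumerate(blocks):
            cached = cache[b]
            if step_mask[b] == 1 and cached is not None:
                # Pruned: substitute the cached activation from an earlier step.
                residual = cached
            else:
                # Not pruned (or nothing cached yet): compute and refresh the cache.
                residual = block(h, k)
                cache[b] = residual
                evaluated += 1
            h = h + residual
        x = h
    return x, evaluated


blocks = [make_block(0.1), make_block(0.2), make_block(0.3)]
dense_mask = [[0, 0, 0] for _ in range(4)]
sparse_mask = [[0, 0, 0], [1, 0, 1], [1, 1, 1], [0, 1, 0]]

_, dense_evals = denoise_with_reuse(blocks, dense_mask, 1.0)
_, sparse_evals = denoise_with_reuse(blocks, sparse_mask, 1.0)
print(dense_evals, sparse_evals)  # prints: 12 6
```

In this toy run, half of the block evaluations are replaced by cache lookups. Whether the resulting actions stay accurate depends entirely on which entries are pruned, which is the part the paper's pruner is meant to decide.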
Peer Reviews
Decision · Submitted to ICLR 2026
All design choices seem reasonable and are properly ablated, and overall improvement over other diffusion pruning methods seems substantial.
- I found the motivation in the abstract and introduction at odds with the actual algorithm. The authors motivated the study of Diffusion policies by their ability to model multi-modal distributions. Since many deep RL algorithms' exploration heuristics use stochastic policies, it seems indeed important to have policies that can model a wider range of distributions. However, the authors only consider behavioral cloning of an expert policy, and while multi-modal and stochastic policies might be h
1. The paper addresses a **highly important and timely research problem**, especially as diffusion models continue to gain prominence in **imitation learning**, **reinforcement learning**, and **Vision-Language-Action (VLA)** modeling.
2. The paper presents **extensive simulation experiments**, offering strong empirical evidence for the effectiveness and robustness of the proposed method.
3. The paper is **well-written**, **clearly structured**, and **easy to follow**, effectively communicatin
### Major Weakness:
1. The authors are strongly encouraged to include comparisons with traditional **diffusion acceleration methods**, such as **DDIM** or **Consistency Policy**, to enhance the **completeness** and **thoroughness** of the paper’s experimental evaluation.
2. It is noted that in RoboMimic tasks, SAG even achieves higher performance compared to Diffusion Policy with full denoising process. The authors are highly recommended to dive deeper into this phenomenon instead of just conc
1. Clear problem framing and motivation. The paper grounds the latency issue of diffusion policies in realistic control frequencies (e.g., 50 steps × 1 ms ≈ 50 ms → 20 Hz on RTX 4090; insufficient for Franka 50–1000 Hz), which is a compelling, concrete rationale for acceleration beyond image generation settings.
2. Methodological novelty: observation-conditioned, real-time pruning. The real-time diffusion pruner predicts a binary mask for all K timesteps and 3L blocks in a single forward pass
1. Lack of real-robot validation. All evaluations appear to be simulation-based (RoboMimic tasks, Franka Kitchen). For claims of real-time control, a small-scale hardware validation (latency stability, sensor noise, control jitter) would substantially strengthen the case.
2. Runtime analysis is mostly relative; absolute latencies are under-reported. While speedup factors are clear, the paper would benefit from absolute inference time per control step (ms) and achieved control frequency (Hz) fo
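The relation between the latency figures quoted in this review (50 steps × 1 ms ≈ 50 ms → 20 Hz) and a relative speedup factor can be worked through as a back-of-the-envelope calculation. The helper `control_frequency_hz` is a hypothetical name, and the sketch assumes that per-action latency is simply steps × time per step divided by the speedup, with no other overhead. It is not a measurement from the paper.

```python
def control_frequency_hz(
    num_steps: int, ms_per_step: float, speedup: float = 1.0
) -> float:
    """Achievable control frequency if one action needs num_steps denoising steps.

    Assumes latency = num_steps * ms_per_step / speedup (no other overhead).
    """
    latency_ms = num_steps * ms_per_step / speedup
    return 1000.0 / latency_ms


baseline_hz = control_frequency_hz(50, 1.0)                  # 50 ms per action
accelerated_hz = control_frequency_hz(50, 1.0, speedup=4.0)  # 12.5 ms per action
print(baseline_hz, accelerated_hz)  # prints: 20.0 80.0
```

Under these assumptions, the reported upper bound of 4× would correspond to roughly 80 Hz, which falls inside the 50–1000 Hz range the reviewer cites for Franka, near its lower end. Measured per-step latencies, as the reviewer requests, would be needed to confirm any such figure.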
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Human Motion and Animation
