CACTO-BIC: Scalable Actor-Critic Learning via Biased Sampling and GPU-Accelerated Trajectory Optimization

Elisa Alboni; Pietro Noah Crestaz; Elias Fontanari; Andrea Del Prete

arXiv:2602.19699 · cs.RO · February 24, 2026

TL;DR

CACTO-BIC makes CACTO-style actor-critic learning more scalable by combining biased initial-state sampling with GPU-accelerated trajectory optimization, improving efficiency and extending applicability to high-dimensional systems and real-time control tasks.

Contribution

The paper introduces CACTO-BIC, an extension of CACTO that improves data efficiency through biased initial-state sampling and reduces computation time through GPU-accelerated trajectory optimization.

Findings

01. Improved sample efficiency over CACTO.

02. Faster computation compared to prior methods.

03. Effective on high-dimensional systems such as AlienGO.

Abstract

Trajectory Optimization (TO) and Reinforcement Learning (RL) offer complementary strengths for solving optimal control problems. TO efficiently computes locally optimal solutions but can struggle with non-convexity, while RL is more robust to non-convexity at the cost of significantly higher computational demands. CACTO (Continuous Actor-Critic with Trajectory Optimization) was introduced to combine these advantages by learning a warm-start policy that guides the TO solver towards low-cost trajectories. However, scalability remains a key limitation, as increasing system complexity significantly raises the computational cost of TO. This work introduces CACTO-BIC to address these challenges. CACTO-BIC improves data efficiency by biasing initial-state sampling, leveraging a property of the value function associated with locally optimal policies; moreover, it reduces computation time by…
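
To make the warm-start idea in the abstract concrete, the following is a minimal toy sketch, not the authors' method. It assumes a hypothetical one-dimensional non-convex cost as a stand-in for a trajectory cost, and plain gradient descent as a stand-in for the local TO solver. The names `cost`, `cost_grad`, and `local_solve`, and both initial guesses, are illustrative assumptions; in CACTO the warm start would come from a learned policy, whereas here it is hard-coded.

```python
# Toy illustration (assumption, not from the paper): a local solver on a
# non-convex cost ends in different minima depending on its initial guess.

def cost(x):
    # Hypothetical non-convex cost with two local minima, near x = +0.96
    # (higher cost) and x = -1.04 (lower cost).
    return (x**2 - 1.0) ** 2 + 0.3 * x


def cost_grad(x):
    # Analytic derivative of cost(x).
    return 4.0 * x * (x**2 - 1.0) + 0.3


def local_solve(x0, step=0.05, iters=200):
    # Plain gradient descent: a stand-in for a local TO solver.
    x = x0
    for _ in range(iters):
        x -= step * cost_grad(x)
    return x


# Poor initial guess: the solver settles in the higher-cost local minimum.
x_cold = local_solve(1.5)

# Hard-coded "warm start" (in CACTO this role is played by a learned policy):
# the same solver now reaches the lower-cost minimum.
x_warm = local_solve(-0.5)

print("cold start:", x_cold, "cost:", cost(x_cold))
print("warm start:", x_warm, "cost:", cost(x_warm))
```

The solver is identical in both runs; only the initial guess changes which local minimum it reaches, which is the sensitivity a learned warm-start policy is meant to mitigate.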

Taxonomy

Topics: Robotic Path Planning Algorithms · Reinforcement Learning in Robotics · Spacecraft Dynamics and Control