Multi-CALF: A Policy Combination Approach with Statistical Guarantees

Georgiy Malaniya; Anton Bolychev; Grigory Yaremenko; Anastasia Krasnaya; Pavel Osinenko

arXiv:2505.12350 · cs.LG · May 20, 2025

TL;DR

Multi-CALF combines a standard reinforcement learning policy with a theoretically backed alternative policy, inheriting the alternative's formal stability guarantees while often outperforming either policy on its own in control tasks.

Contribution

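The paper introduces a policy combination method, proves that the combined policy converges to a specified goal set with known probability, bounds its maximum deviation and convergence time, and validates it empirically on control tasks.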

Findings

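01. Often achieves better control performance than either individual policy.

02. Proves convergence to a specified goal set with known probability, with bounds on maximum deviation and convergence time.

03. Demonstrates enhanced performance on control tasks while maintaining the stability guarantees.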

Abstract

We introduce Multi-CALF, an algorithm that intelligently combines reinforcement learning policies based on their relative value improvements. Our approach integrates a standard RL policy with a theoretically-backed alternative policy, inheriting formal stability guarantees while often achieving better performance than either policy individually. We prove that our combined policy converges to a specified goal set with known probability and provide precise bounds on maximum deviation and convergence time. Empirical validation on control tasks demonstrates enhanced performance while maintaining stability guarantees.
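
The abstract describes choosing between policies "based on their relative value improvements". The following is a minimal sketch of what such a switching rule could look like, not the paper's actual algorithm. The names `value_improvement`, `select_policy`, `toy_value_base`, and `toy_value_alt`, the greedy comparison, and the tie-break in favor of the alternative policy are all assumptions made for illustration.

```python
from typing import Callable, Sequence

State = Sequence[float]
ValueFn = Callable[[State], float]


def value_improvement(value_fn: ValueFn, prev_state: State, state: State) -> float:
    """Change in a value estimate over the last transition (higher is better)."""
    return value_fn(state) - value_fn(prev_state)


def select_policy(
    prev_state: State,
    state: State,
    value_base: ValueFn,
    value_alt: ValueFn,
) -> str:
    """Hypothetical switching rule (an assumption, not taken from the paper).

    Returns "base" when the standard RL policy's value estimate improved
    strictly more over the last transition, and "alt" otherwise, so ties
    fall back to the theoretically backed alternative policy.
    """
    base_gain = value_improvement(value_base, prev_state, state)
    alt_gain = value_improvement(value_alt, prev_state, state)
    return "base" if base_gain > alt_gain else "alt"


# Toy value estimates (illustrative assumptions): both prefer states near 0.
def toy_value_base(s: State) -> float:
    return -abs(s[0])


def toy_value_alt(s: State) -> float:
    return -2.0 * abs(s[0])
```

In a control loop, the string returned by `select_policy` would decide which policy's action is applied at the next step. The statistical guarantees stated in the abstract depend on the paper's actual conditions, which this sketch does not reproduce.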

Code & Models

Repositories

aidagroup/multi-calf (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Robot Manipulation and Learning

Methods: Sparse Evolutionary Training