CoFi-PGMA: Counterfactual Policy Gradients under Filtered Feedback for Multi-Agent LLMs

Stela Tong; Elai Ben-Gal

arXiv:2604.22785·cs.LG·April 28, 2026


TL;DR

This paper introduces CoFi-PGMA, a unified reinforcement learning framework for multi-agent large language models that corrects the learning signal under filtered feedback in both routing and collaborative systems.

Contribution

It develops a counterfactual policy gradient method that corrects learning signals under filtered feedback, applicable to both routing and collaborative multi-agent LLM architectures.

Findings

01. The method improves learning efficiency in multi-agent LLM systems.

02. Counterfactual estimators enable better credit assignment in filtered feedback scenarios.

03. The method is shown to be effective on a real-world reasoning dataset.

Abstract

Large language model (LLM) deployments increasingly rely on multi-agent architectures in which multiple models either compete through routing mechanisms or collaborate to produce a final answer. In both settings, the learning signal received by each agent is filtered by the system mechanism. Routing produces selection-gated feedback where only the chosen response is evaluated, while collaboration produces shared rewards that obscure the individual contribution of each agent. As a result, standard RLHF objectives designed for a single deployed policy become misspecified. We introduce CoFi-PGMA (Counterfactual Policy Gradients under Filtered Feedback for Multi-Agent LLMs), a unified framework for learning under filtered feedback in multi-agent LLM systems. Our approach derives a counterfactual per-agent training objective based on marginal contribution, which corrects the learning signal…
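The two kinds of filtering described in the abstract can be illustrated with a toy sketch. This is not the paper's estimator or objective, which the truncated abstract does not spell out. The first function computes a leave-one-out marginal contribution under a shared team reward. The second uses ordinary inverse-propensity weighting, a standard off-policy correction swapped in here as an assumption, to recover per-agent reward estimates when only the routed response is scored. All names (`marginal_contributions`, `ips_reward_estimates`, `toy_team_reward`, the agent labels, and the reward values) are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple


def marginal_contributions(
    agents: List[str],
    team_reward: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Leave-one-out credit: R(all agents) - R(all agents except i)."""
    full = frozenset(agents)
    total = team_reward(full)
    return {a: total - team_reward(full - {a}) for a in agents}


def ips_reward_estimates(
    logs: List[Tuple[str, float, float]],
    agents: List[str],
) -> Dict[str, float]:
    """Inverse-propensity estimate of each agent's mean reward from routed logs.

    Each log entry is (chosen_agent, routing_probability, observed_reward).
    Only the chosen agent's reward is observed, so it is up-weighted by
    1 / routing_probability; unchosen agents contribute 0 for that round.
    """
    n = len(logs)
    totals = {a: 0.0 for a in agents}
    for chosen, prob, reward in logs:
        totals[chosen] += reward / prob
    return {a: totals[a] / n for a in agents}


# Hypothetical shared-reward table for a two-agent collaboration.
TOY_REWARDS = {
    frozenset(): 0.0,
    frozenset({"solver"}): 0.5,
    frozenset({"critic"}): 0.25,
    frozenset({"solver", "critic"}): 1.0,
}


def toy_team_reward(team: FrozenSet[str]) -> float:
    return TOY_REWARDS[team]


if __name__ == "__main__":
    # Collaboration: both agents see the same reward of 1.0, but their
    # leave-one-out contributions differ.
    print(marginal_contributions(["solver", "critic"], toy_team_reward))

    # Routing: a uniform router picks one of two agents per query, and only
    # the picked agent's response is scored.
    routed_logs = [("A", 0.5, 1.0), ("B", 0.5, 0.0), ("A", 0.5, 1.0), ("B", 0.5, 1.0)]
    print(ips_reward_estimates(routed_logs, ["A", "B"]))
```

In the collaborative case, the shared reward alone cannot distinguish the two agents, while the leave-one-out difference can. In the routing case, averaging only over rounds where an agent was selected would ignore how often the router chose it; the propensity weight accounts for that selection step.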
