Cooperative Optimal Output Tracking for Discrete-Time Multiagent Systems: Stabilizing Policy Iteration Frameworks

Dongdong Li; Jiuxiang Dong

arXiv:2501.06510·eess.SY·January 27, 2026

TL;DR

This paper introduces two policy iteration-based algorithms for cooperative optimal output tracking in discrete-time multi-agent systems with unknown model parameters. The first can start from an arbitrary initial control policy; the second, an equivalent Q-learning formulation, needs less system data. Both are validated through simulations.

Contribution

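The paper develops two policy iteration (PI) based algorithms for multi-agent systems: one removes traditional PI's dependence on an initial stabilizing control policy, and an equivalent Q-learning variant needs less system data. It also analyzes the stability of two step-size adjustment schemes and realizes tracking with a distributed feedforward-feedback controller.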

Findings

01

Algorithms successfully stabilize multi-agent systems from any initial policy.

02

The Q-learning framework requires less system data than the stabilizing PI framework to reach the same result.

03

Simulations confirm the effectiveness of the proposed approaches.

Abstract

This paper proposes two cooperative optimal output tracking (COOT) algorithms based on policy iteration (PI) for discrete-time multi-agent systems with unknown model parameters. First, we establish a stabilizing PI framework that can start from any initial control policy, relaxing the dependence of traditional PI on an initial stabilizing control policy. Then, an efficient and equivalent Q-learning framework is developed, which is shown to require less system data to obtain the same results as the stabilizing PI. In both frameworks, the stabilizing control policy is obtained by gradually iterating from a stabilizing virtual system to the actual feedback closed-loop system. Two explicit schemes for adjusting the iteration step-size/coefficient are designed and their stability is analyzed. Finally, the COOT is realized by a distributed feedforward-feedback controller with learned…
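The idea of iterating from a stabilizing virtual system to the actual closed loop can be illustrated with a minimal sketch. It is not the authors' algorithm: it is model-based, single-agent, and solves a plain LQ regulation problem, whereas the paper is data-driven, multi-agent, and concerned with output tracking. The scaling factor `gamma`, the midpoint update rule, the `margin` parameter, and the example matrices are all assumptions made for illustration. The virtual system is taken to be `gamma * (A - B K)`, which is stable for any gain `K` once `gamma` is small enough; `gamma` is then raised toward 1 while policy iteration keeps the virtual closed loop stable.

```python
import numpy as np

def spectral_radius(M):
    """Largest eigenvalue magnitude of M."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

def evaluate_policy(A, B, K, Q, R, gamma):
    """Solve P = Qk + Ac^T P Ac for the virtual closed loop Ac = gamma (A - B K)."""
    n = A.shape[0]
    Ac = gamma * (A - B @ K)
    Qk = Q + K.T @ R @ K
    # Vectorized Lyapunov equation: (I - Ac^T kron Ac^T) vec(P) = vec(Qk)
    lhs = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(lhs, Qk.reshape(-1)).reshape(n, n)

def improve_policy(A, B, P, R, gamma):
    """Greedy gain for the virtual system (gamma A, gamma B)."""
    g2 = gamma ** 2
    return np.linalg.solve(R + g2 * B.T @ P @ B, g2 * B.T @ P @ A)

def stabilizing_pi(A, B, Q, R, K0, margin=0.9, tol=1e-10, max_iter=200):
    """Policy iteration started from an arbitrary gain K0 (hypothetical scheme)."""
    K = K0.copy()
    # Pick gamma so that gamma * (A - B K0) has spectral radius <= margin < 1.
    gamma = min(1.0, margin / max(spectral_radius(A - B @ K), 1e-12))
    for _ in range(max_iter):
        P = evaluate_policy(A, B, K, Q, R, gamma)
        K_new = improve_policy(A, B, P, R, gamma)
        converged = gamma == 1.0 and np.linalg.norm(K_new - K) < tol
        K = K_new
        if converged:
            break
        # Assumed step rule: move gamma halfway to the stability bound 1 / rho,
        # so gamma * (A - B K) stays strictly stable; cap at the actual system.
        rho = max(spectral_radius(A - B @ K), 1e-12)
        gamma = min(1.0, 0.5 * (gamma + 1.0 / rho))
    return K, P, gamma

# Illustrative unstable plant; K0 = 0 is not stabilizing.
A = np.array([[1.1, 0.5], [0.0, 1.2]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
K0 = np.zeros((1, 2))

K, P, gamma = stabilizing_pi(A, B, Q, R, K0)
print("final gamma:", gamma)
print("closed-loop spectral radius:", spectral_radius(A - B @ K))
```

Once `gamma` reaches 1 the loop reduces to standard policy iteration on the actual system, so the returned gain is the ordinary LQ-optimal gain for this toy problem. A data-driven version would replace `evaluate_policy` with an estimate built from measured trajectories, which is where the paper's Q-learning framework would come in.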
