Cooperative Optimal Output Tracking for Discrete-Time Multiagent Systems: Stabilizing Policy Iteration Frameworks
Dongdong Li, Jiuxiang Dong

TL;DR
This paper introduces two policy iteration-based algorithms for cooperative optimal output tracking in discrete-time multi-agent systems with unknown model parameters. Both can start from an arbitrary initial control policy, the Q-learning variant requires less system data, and the approach is validated through simulations.
Contribution
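The paper develops two PI-based algorithms for discrete-time multi-agent systems: a stabilizing PI framework that removes the need for an initial stabilizing control policy, and an equivalent Q-learning framework that requires less system data. It also analyzes the stability of two schemes for adjusting the iteration step-size/coefficient and realizes the tracking with a distributed feedforward-feedback controller.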
Findings
Both algorithms obtain a stabilizing control policy starting from any initial control policy.
The Q-learning framework requires less system data than the stabilizing PI framework while producing the same results.
Simulations confirm the effectiveness of the proposed approaches.
Abstract
This paper proposes two cooperative optimal output tracking (COOT) algorithms based on policy iteration (PI) for discrete-time multi-agent systems with unknown model parameters. First, we establish a stabilizing PI framework that can start from any initial control policy, relaxing the dependence of traditional PI on an initial stabilizing control policy. Then, an efficient and equivalent Q-learning framework is developed, which is shown to require less system data to obtain the same results as the stabilizing PI. In both frameworks, the stabilizing control policy is obtained by gradually iterating the stabilizing virtual system to the actual feedback closed-loop system. Two explicit schemes for adjusting the iteration step-size/coefficient are designed and their stability is analyzed. Finally, the COOT is realized by a distributed feedforward-feedback controller with learned…
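To make the idea of iterating a stabilizing virtual system toward the actual closed-loop system concrete, here is a minimal single-agent sketch. It is not the paper's algorithm. It assumes the model matrices A and B are known, whereas the paper treats them as unknown and learns from data. It uses a simple scaling homotopy: policy iteration is run on the virtual system (a·A, a·B) with a scale a in (0, 1], and a is raised until it reaches 1. The function names and the rule for raising a are illustrative assumptions and are not the paper's step-size schemes.

```python
import numpy as np


def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return max(abs(np.linalg.eigvals(M)))


def solve_lyapunov(Ac, W):
    """Solve P = Ac^T P Ac + W by vectorization (fine for small state dimensions)."""
    n = Ac.shape[0]
    lhs = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)


def policy_iteration(A, B, Q, R, K, a, sweeps=50, tol=1e-10):
    """Standard model-based PI on the scaled (virtual) system (a*A, a*B).

    Requires that K stabilizes the scaled system, i.e. rho(a*(A - B K)) < 1.
    """
    for _ in range(sweeps):
        Ac = a * (A - B @ K)
        P = solve_lyapunov(Ac, Q + K.T @ R @ K)            # policy evaluation
        K_new = np.linalg.solve(R + a**2 * B.T @ P @ B,    # policy improvement
                                a**2 * B.T @ P @ A)
        if np.linalg.norm(K_new - K) < tol:
            return K_new, P
        K = K_new
    return K, P


def stabilizing_pi(A, B, Q, R, max_outer=200):
    """Start from K = 0 (possibly not stabilizing) and raise the scale a to 1."""
    n, m = A.shape[0], B.shape[1]
    K = np.zeros((m, n))
    rho = spectral_radius(A)
    # Illustrative choice: shrink the system just enough that K = 0 is stabilizing.
    a = 1.0 if rho < 1 else 0.9 / rho
    for _ in range(max_outer):
        K, P = policy_iteration(A, B, Q, R, K, a)
        if a == 1.0:
            return K, P
        rho = spectral_radius(A - B @ K)
        # Illustrative step rule (an assumption): jump to a = 1 once K stabilizes
        # the actual system; otherwise move the scaled spectral radius halfway to 1.
        a = 1.0 if rho < 1 else 0.5 * (1 + a * rho) / rho
    raise RuntimeError("scale did not reach 1 within max_outer iterations")


if __name__ == "__main__":
    # Hypothetical open-loop unstable example (eigenvalues 1.1 and 1.2).
    A = np.array([[1.1, 0.3], [0.0, 1.2]])
    B = np.array([[0.0], [1.0]])
    Q, R = np.eye(2), np.eye(1)
    K, P = stabilizing_pi(A, B, Q, R)
    print("gain K:", K)
    print("closed-loop spectral radius:", spectral_radius(A - B @ K))
```

Each new scale is chosen so that the current gain still stabilizes the rescaled system, which lets every policy iteration run start from a valid policy even though the initial gain K = 0 does not stabilize the actual system.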
