Proximal Policy Optimization-based Transmit Beamforming and Phase-shift Design in an IRS-aided ISAC System for the THz Band
Xiangnan Liu, Haijun Zhang, Keping Long, Mingyu Zhou, Yonghui Li, and H. Vincent Poor

TL;DR
This paper introduces a reinforcement learning approach using proximal policy optimization to jointly optimize transmit beamforming and phase-shift design in an IRS-aided THz ISAC system, enhancing system capacity.
Contribution
It develops a novel PPO-based joint optimization framework for IRS-assisted THz ISAC systems, including a distributed multi-threading version for multi-user MIMO scenarios.
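To make the optimized quantities concrete, the following is a minimal single-user sketch of how a transmit beamformer and a set of discrete IRS phase shifts could enter an achievable-rate expression. The channel model (direct link plus one reflected path per IRS element), the uniform phase quantizer, and all names (`quantize_phase`, `effective_channel`, `achievable_rate`) are illustrative assumptions, not the paper's system model, which this summary does not state.

```python
import cmath
import math


def quantize_phase(theta, bits):
    """Snap a phase (radians) to the nearest of 2**bits uniform levels in [0, 2*pi)."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    return (round(theta / step) % levels) * step


def effective_channel(h_direct, g_bs_irs, h_irs_user, phases):
    """Assumed model: h_eff[m] = h_direct[m] + sum_n h_irs_user[n] * e^{j*phase_n} * g_bs_irs[n][m]."""
    h_eff = []
    for m in range(len(h_direct)):
        reflected = sum(
            h_irs_user[n] * cmath.exp(1j * phases[n]) * g_bs_irs[n][m]
            for n in range(len(phases))
        )
        h_eff.append(h_direct[m] + reflected)
    return h_eff


def achievable_rate(h_direct, g_bs_irs, h_irs_user, phases, w, noise_power):
    """Rate in bit/s/Hz for one single-antenna user served with beamformer w."""
    h_eff = effective_channel(h_direct, g_bs_irs, h_irs_user, phases)
    signal = abs(sum(h * wm for h, wm in zip(h_eff, w))) ** 2
    return math.log2(1.0 + signal / noise_power)
```

In this toy model, the two reflected paths cancel when both phase shifts are zero and add coherently when the second element applies a shift of pi, which is the kind of gain a phase-shift design is meant to capture.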
Findings
PPO effectively optimizes beamforming and phase shifts.
Distributed PPO improves multi-user MIMO performance.
Simulation confirms the algorithm's effectiveness.
Abstract
In this paper, an intelligent reflecting surface (IRS)-aided integrated sensing and communications (ISAC) system operating in the terahertz (THz) band is proposed to maximize the system capacity. Transmit beamforming and phase-shift design are transformed into a universal optimization problem with ergodic constraints. Then the joint optimization of transmit beamforming and phase-shift design is achieved by gradient-based, primal-dual proximal policy optimization (PPO) in the multi-user multiple-input single-output (MISO) scenario. Specifically, the actor part generates continuous transmit beamforming and the critic part takes charge of discrete phase-shift design. Based on the MISO scenario, we investigate a distributed PPO (DPPO) framework with the concept of multi-threading learning in the multi-user multiple-input multiple-output (MIMO) scenario. Simulation results demonstrate the effectiveness of the primal-dual…
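As a rough illustration of the two ingredients named in the abstract, the sketch below shows the standard PPO clipped surrogate objective together with a generic projected dual-ascent step for a constraint of the form "average value at least a threshold". The function names, the default clip range of 0.2, and the form of the dual update are assumptions chosen for illustration; the paper's exact primal-dual update is not given in this summary.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: per-sample log-probabilities of the taken actions
    under the current and the data-collecting policy, respectively.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # importance ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the raw and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def dual_ascent_step(lmbda, constraint_value, threshold, lr):
    """Hypothetical dual update for a constraint E[c] >= threshold.

    The multiplier grows while the constraint is violated and is
    projected back onto the non-negative reals.
    """
    return max(0.0, lmbda + lr * (threshold - constraint_value))
```

In a primal-dual scheme of this kind, the multiplier would typically weight the constraint term in the reward seen by the policy, so that policy updates and multiplier updates alternate.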
Taxonomy
Methods: Entropy Regularization · Proximal Policy Optimization
