Composite Reward Design in PPO-Driven Adaptive Filtering

Abdullah Burkan Bereketoglu

arXiv:2506.06323 · eess.SP · June 10, 2025

TL;DR

This letter proposes a PPO-based adaptive filtering framework guided by a composite reward that balances SNR improvement, MSE reduction, and residual smoothness. On synthetic signals with various noise types, the agent outperforms classical filters, generalizes beyond its training distribution, and runs in real time.

Contribution

A PPO-based adaptive filtering approach with a composite reward that enables robust, real-time denoising in dynamic environments and surpasses classical filters such as LMS, RLS, Wiener, and Kalman.

Findings

01 The PPO-based filter outperforms classical filters on synthetic signals under various noise types.

02 The agent generalizes beyond its training distribution.

03 The method achieves real-time adaptive filtering performance.

Abstract

Model-free and reinforcement learning-based adaptive filtering methods are gaining traction for denoising in dynamic, non-stationary environments such as wireless signal channels. Traditional filters such as LMS, RLS, Wiener, and Kalman are limited because they assume stationarity or require complex fine-tuning, exact noise statistics, or fixed models. This letter proposes an adaptive filtering framework using Proximal Policy Optimization (PPO), guided by a composite reward that balances SNR improvement, MSE reduction, and residual smoothness. Experiments on synthetic signals with various noise types show that our PPO agent generalizes beyond its training distribution, achieving real-time performance and outperforming classical filters. This work demonstrates the viability of policy-gradient reinforcement learning for robust, low-latency adaptive signal filtering.
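To make the idea of a composite reward concrete, the following is a minimal sketch of how the three terms named in the abstract (SNR improvement, MSE reduction, residual smoothness) could be combined into one scalar. It is not the paper's formulation: the function names, the weights `w_snr`, `w_mse`, and `w_smooth`, the weighted-sum form, and the squared-first-difference smoothness measure are all assumptions chosen for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)


def snr_db(clean, estimate):
    """SNR of `estimate` in dB, treating (clean - estimate) as noise."""
    signal_power = np.mean(clean ** 2)
    noise_power = np.mean((clean - estimate) ** 2)
    return 10.0 * np.log10((signal_power + EPS) / (noise_power + EPS))


def composite_reward(clean, noisy, filtered,
                     w_snr=1.0, w_mse=1.0, w_smooth=0.1):
    """Hypothetical composite reward; weights and form are assumptions.

    Rewards SNR gain and MSE reduction relative to the unfiltered input,
    and penalizes a rough (rapidly varying) residual.
    """
    # SNR improvement of the filtered signal over the noisy input, in dB.
    snr_gain = snr_db(clean, filtered) - snr_db(clean, noisy)

    # MSE reduction: positive when filtering brings the signal closer to clean.
    mse_reduction = (np.mean((clean - noisy) ** 2)
                     - np.mean((clean - filtered) ** 2))

    # Residual roughness: mean squared first difference of the residual.
    residual = filtered - clean
    roughness = np.mean(np.diff(residual) ** 2)

    return w_snr * snr_gain + w_mse * mse_reduction - w_smooth * roughness


# Usage on a synthetic signal (all values here are illustrative).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 1000)
clean = np.sin(2.0 * np.pi * 5.0 * t)
noise = 0.5 * rng.standard_normal(t.shape)
noisy = clean + noise

reward_good = composite_reward(clean, noisy, clean + 0.1 * noise)
reward_identity = composite_reward(clean, noisy, noisy)
```

In this sketch a filter that removes most of the noise earns a positive reward, while a filter that passes the input through unchanged earns no SNR or MSE credit and is left with only the smoothness penalty. In a PPO setup, a scalar of this kind would be returned by the environment at each step; how the paper scales or normalizes its terms is not stated on this page.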

Code & Models

Repositories

Bradshard/Reinforcement_Learning/tree/main/rl_signal_filtering (PyTorch, official)

Taxonomy

Topics: Advanced Adaptive Filtering Techniques · Speech and Audio Processing · Reinforcement Learning in Robotics

Methods: Entropy Regularization · Proximal Policy Optimization