Composite Reward Design in PPO-Driven Adaptive Filtering
By: Abdullah Burkan Bereketoglu
Potential Business Impact:
Cleans up noisy signals better than classical filters.
Model-free, reinforcement learning-based adaptive filtering methods are gaining traction for denoising in dynamic, non-stationary environments such as wireless signal channels. Traditional filters such as LMS, RLS, Wiener, and Kalman are limited because they assume stationarity, rely on fixed models, or require exact noise statistics or complex fine-tuning. This letter proposes an adaptive filtering framework using Proximal Policy Optimization (PPO), guided by a composite reward that balances SNR improvement, MSE reduction, and residual smoothness. Experiments on synthetic signals with various noise types show that the PPO agent generalizes beyond its training distribution, achieves real-time performance, and outperforms classical filters. This work demonstrates the viability of policy-gradient reinforcement learning for robust, low-latency adaptive signal filtering.
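The abstract names the three reward terms but does not give their exact form or weighting. A minimal sketch of how such a composite reward could be computed follows, under stated assumptions: the function names, the weights `w_snr`, `w_mse`, and `w_smooth`, and the choice to measure residual smoothness as the mean squared first difference of the residual are all hypothetical illustrations, not the author's formulation.

```python
import numpy as np

def snr_db(clean, estimate):
    """SNR of an estimate against the clean reference, in dB."""
    signal_power = np.mean(clean ** 2)
    noise_power = np.mean((clean - estimate) ** 2)
    # Small epsilon avoids division by zero for a perfect estimate.
    return 10.0 * np.log10(signal_power / (noise_power + 1e-12))

def composite_reward(clean, noisy, filtered,
                     w_snr=1.0, w_mse=1.0, w_smooth=0.1):
    """Hypothetical composite reward; weights are illustrative assumptions."""
    # Term 1: SNR improvement of the filtered signal over the noisy input.
    snr_gain = snr_db(clean, filtered) - snr_db(clean, noisy)
    # Term 2: mean squared error against the clean reference (penalized).
    mse = np.mean((clean - filtered) ** 2)
    # Term 3: roughness of the residual, assumed here to be the mean
    # squared first difference; lower roughness means a smoother residual.
    residual = filtered - clean
    roughness = np.mean(np.diff(residual) ** 2)
    return w_snr * snr_gain - w_mse * mse - w_smooth * roughness

# Usage on a synthetic sine wave with additive Gaussian noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)
reward_no_filtering = composite_reward(clean, noisy, noisy)
reward_perfect = composite_reward(clean, noisy, clean)
```

In this sketch an agent that passes the noisy input through unchanged earns no SNR gain and pays both penalties, while a perfect reconstruction earns a large positive reward. In a PPO setup the reward would be returned by the environment at each step; the relative weights determine how the agent trades raw error reduction against a smooth residual.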
Similar Papers
A Simple and Effective Reinforcement Learning Method for Text-to-Image Diffusion Fine-tuning
Machine Learning (CS)
Makes AI art generators better and faster.
Preference Optimization for Combinatorial Optimization Problems
Machine Learning (CS)
Teaches computers to solve hard puzzles better.
Iterative Self-Training for Code Generation via Reinforced Re-Ranking
Computation and Language
Makes computers write better code, faster.