Reward Augmentation in Reinforcement Learning for Testing Distributed Systems
Andrea Borgarelli, Constantin Enea, Rupak Majumdar, Srinidhi Nagendra

TL;DR
This paper introduces a reinforcement learning-based testing method for distributed protocols, using reward augmentation techniques like exploration bonuses and semantic waypoints to improve bug detection and coverage.
Contribution
The paper presents novel reward augmentation techniques for RL-based testing, including decaying state-based exploration bonuses and semantic waypoints, to enhance exploration in distributed system testing.
Findings
The approach outperforms baseline methods in both coverage and bug detection
Reward augmentation (exploration bonuses and waypoints) enables effective exploration despite a sparse natural reward
Applicable to large distributed system benchmarks
Abstract
Bugs in popular distributed protocol implementations have been the source of many downtimes in popular internet services. We describe a randomized testing approach for distributed protocol implementations based on reinforcement learning. Since the natural reward structure is very sparse, the key to successful exploration in reinforcement learning is reward augmentation. We show two different techniques that build on one another. First, we provide a decaying exploration bonus based on the discovery of new states -- the reward decays as the same state is visited multiple times. The exploration bonus captures the intuition from coverage-guided fuzzing of prioritizing new coverage points; in contrast to other schemes, we show that taking the maximum of the bonus and the Q-value leads to more effective exploration. Second, we provide waypoints to the algorithm as a sequence of predicates…
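The first technique in the abstract can be illustrated with a small tabular sketch. This is not the authors' implementation: the class name `BonusMaxQ`, the decay schedule `1 / (visits + 1)`, and the exact form of the update target are assumptions made for illustration. The only idea taken from the abstract is that the bonus shrinks as a state is revisited, and that the learner takes the maximum of the bonus and the Q-value rather than adding them.

```python
from collections import defaultdict


class BonusMaxQ:
    """Illustrative tabular learner; all names here are hypothetical."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha              # learning rate
        self.gamma = gamma              # discount factor
        self.q = defaultdict(float)     # (state, action) -> value estimate
        self.visits = defaultdict(int)  # state -> number of visits so far

    def bonus(self, state):
        # Assumed decay schedule: 1 for a new state, shrinking with each visit.
        return 1.0 / (self.visits[state] + 1)

    def update(self, state, action, next_state):
        # Compute the bonus before recording the visit, so a brand-new
        # state earns the full bonus of 1.
        b = self.bonus(next_state)
        self.visits[next_state] += 1
        future = self.gamma * max(
            self.q[(next_state, a)] for a in self.actions
        )
        # Assumption: the target is the max of the bonus and the discounted
        # Q-value, instead of the usual sum of reward and future value.
        target = max(b, future)
        key = (state, action)
        self.q[key] = (1 - self.alpha) * self.q[key] + self.alpha * target
        return self.q[key]


agent = BonusMaxQ(actions=["deliver", "drop"])
first = agent.update("s0", "deliver", "s1")
```

Under these assumptions, a frequently visited state stops attracting the learner on its own account, but a path through it stays valuable as long as it leads to states whose bonus is still high.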
