Deep Reinforcement Fuzzing

Konstantin Böttinger; Patrice Godefroid; Rishabh Singh

arXiv:1801.04589 · cs.AI · January 16, 2018

TL;DR

This paper formalizes fuzzing as a Markov decision process and applies deep Q-learning to it. Preliminary experiments suggest the approach can outperform baseline random fuzzing.

Contribution

It is the first to formalize fuzzing as a reinforcement learning problem and apply deep Q-learning to optimize input generation for vulnerability discovery.

Findings

01

Reinforcement fuzzing can outperform baseline random fuzzing.

02

By observing the rewards from mutation actions applied to an initial input, the fuzzing agent learns a policy that generates new, higher-reward inputs.

03

The evidence is preliminary and comes from the authors' own implementation of the approach.

Abstract

Fuzzing is the process of finding security vulnerabilities in input-processing code by repeatedly testing the code with modified inputs. In this paper, we formalize fuzzing as a reinforcement learning problem using the concept of Markov decision processes. This in turn allows us to apply state-of-the-art deep Q-learning algorithms that optimize rewards, which we define from runtime properties of the program under test. By observing the rewards caused by mutating with a specific set of actions performed on an initial program input, the fuzzing agent learns a policy that can next generate new higher-reward inputs. We have implemented this new approach, and preliminary empirical evidence shows that reinforcement fuzzing can outperform baseline random fuzzing.
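
The loop the abstract describes (mutate an input, observe a reward, update a policy) can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in tabular Q-learning for the deep Q-learning the paper applies, and every name in it (`MAGIC`, `ALPHABET`, `run_target`, `mutate`, `q_fuzz`) is hypothetical. The toy target rewards inputs whose leading bytes match a magic string, standing in for a runtime property of a real program under test. States are whole inputs, actions overwrite one byte, and the reward is the change in the observed property.

```python
import random
from collections import defaultdict

MAGIC = b"FUZZ"     # hypothetical: the toy target favors inputs starting with this
ALPHABET = b"AFUZ"  # hypothetical: byte values a mutation may write


def run_target(data):
    """Toy program under test: count leading bytes that match MAGIC.
    Stands in for a runtime property measured on a real target."""
    matched = 0
    for got, want in zip(data, MAGIC):
        if got != want:
            break
        matched += 1
    return matched


def mutate(data, action):
    """Apply one action: overwrite the byte at position `pos` with `value`."""
    pos, value = action
    return data[:pos] + bytes([value]) + data[pos + 1:]


def q_fuzz(seed, episodes=300, steps=12, alpha=0.5, gamma=0.9,
           epsilon=0.2, rng=None):
    """Tabular Q-learning over byte mutations (an assumption of this sketch).
    Returns (best_input, best_score, q_table)."""
    rng = rng or random.Random(0)
    actions = [(p, v) for p in range(len(seed)) for v in ALPHABET]
    q = defaultdict(float)  # Q[(state, action)], defaults to 0.0
    best, best_score = seed, run_target(seed)
    for _ in range(episodes):
        state = seed  # every episode restarts from the initial input
        score = run_target(state)
        for _ in range(steps):
            # Epsilon-greedy: explore at random, else take the best-known action.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            nxt = mutate(state, action)
            nxt_score = run_target(nxt)
            reward = nxt_score - score  # improvement in the observed property
            # Q-learning update toward reward + discounted best next value.
            target = reward + gamma * max(q[(nxt, a)] for a in actions)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state, score = nxt, nxt_score
            if score > best_score:
                best, best_score = state, score
    return best, best_score, q
```

Calling `q_fuzz(b"AAAA")` returns the highest-scoring input seen during training together with the learned table. Deep Q-learning replaces that lookup table with a neural network that approximates the Q-function, which matters once the state space is too large to enumerate.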

Taxonomy

Methods: Q-Learning