A systematic review of fuzzing based on machine learning techniques

Yan Wang; Peng Jia; Luping Liu; Jiayong Liu

arXiv:1908.01262·cs.CR·August 20, 2020

TL;DR

This paper systematically reviews how machine learning techniques have been integrated into fuzzing for vulnerability detection, analyzing models, performance, and future challenges in enhancing fuzzing effectiveness.

Contribution

It provides a comprehensive analysis of machine learning-based fuzzing models, including their methodologies, evaluation, and comparison with traditional fuzzing tools, highlighting recent advancements and limitations.

Findings

01. Machine learning improves fuzzing performance in vulnerability discovery.

02. ML models show acceptable predictive capabilities for fuzzing.

03. Challenges remain in data imbalance and feature extraction for vulnerabilities.

Abstract

Security vulnerabilities play a vital role in network security systems. Fuzzing is widely used as a vulnerability discovery technique to reduce damage in advance. However, traditional fuzzing techniques face many challenges, such as how to mutate input seed files, how to increase code coverage, and how to effectively bypass verification. Machine learning has been introduced into fuzz testing as a new method to alleviate these challenges. This paper reviews research progress in recent years on using machine learning for fuzz testing, analyzes how machine learning improves the fuzzing process and results, and sheds light on future work in fuzzing. First, this paper discusses why machine learning techniques can be used in fuzzing scenarios and identifies six different stages in which machine learning has been used. Then this paper…
