# LEOPARD: Identifying Vulnerable Code for Vulnerability Assessment through Program Metrics

**Authors:** Xiaoning Du, Bihuan Chen, Yuekang Li, Jianmin Guo, Yaqin Zhou, Yang Liu, Yu Jiang

arXiv: 1901.11479 · 2020-01-22

## TL;DR

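LEOPARD is a lightweight, metric-based framework that identifies potentially vulnerable functions without prior knowledge of known vulnerabilities. In the authors' experiments on 11 real-world projects it outperforms machine learning-based and static analysis-based techniques, and its use in manual code review and fuzzing led to new bugs being found in real-world projects.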

## Contribution

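The paper introduces a generic, lightweight, and extensible framework that works in two steps: it first groups functions into bins using complexity metrics, then ranks the functions in each bin using vulnerability metrics and reports the top-ranked ones as potentially vulnerable. No prior knowledge of known vulnerabilities is required.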

## Key findings

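- Covers 74.0% of vulnerable functions while flagging 20% of all functions as potentially vulnerable, measured on 11 real-world projects
- Outperforms machine learning-based and static analysis-based techniques
- Led to the discovery of 22 new bugs in real applications such as PHP, radare2, and FFmpeg, eight of which are new vulnerabilities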
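The coverage figure above relates two ratios: the share of all functions that are flagged, and the share of known-vulnerable functions that fall inside the flagged set. The following is a minimal sketch of that arithmetic, not the authors' evaluation code; the helper names `flagged_fraction` and `coverage` and the toy function names are assumptions made for illustration.

```python
def flagged_fraction(flagged, all_functions):
    """Share of all functions that were flagged as potentially vulnerable."""
    all_functions = set(all_functions)
    if not all_functions:
        raise ValueError("all_functions must not be empty")
    return len(set(flagged) & all_functions) / len(all_functions)


def coverage(flagged, vulnerable):
    """Share of known-vulnerable functions that appear in the flagged set."""
    vulnerable = set(vulnerable)
    if not vulnerable:
        raise ValueError("vulnerable must not be empty")
    return len(vulnerable & set(flagged)) / len(vulnerable)


# Toy data (made up, not taken from the paper): ten functions, two flagged,
# four known to be vulnerable.
all_functions = [f"f{i}" for i in range(10)]
flagged = ["f0", "f1"]
vulnerable = ["f0", "f1", "f5", "f7"]

print(flagged_fraction(flagged, all_functions))
print(coverage(flagged, vulnerable))
```

In this toy case, flagging 20% of the functions covers half of the vulnerable ones; the paper's reported result corresponds to a coverage of 74.0% at the same flagged share.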

## Abstract

Identifying potentially vulnerable locations in a code base is critical as a pre-step for effective vulnerability assessment; i.e., it can greatly help security experts put their time and effort where it is needed most. Metric-based and pattern-based methods have been presented for identifying vulnerable code. The former relies on machine learning and cannot work well due to the severe imbalance between non-vulnerable and vulnerable code or the lack of features to characterize vulnerabilities. The latter needs prior knowledge of known vulnerabilities and can only identify similar but not new types of vulnerabilities.

In this paper, we propose and implement a generic, lightweight and extensible framework, LEOPARD, to identify potentially vulnerable functions through program metrics. LEOPARD requires no prior knowledge about known vulnerabilities. It works in two steps, combining two sets of systematically derived metrics. First, it uses complexity metrics to group the functions in a target application into a set of bins. Then, it uses vulnerability metrics to rank the functions in each bin and identifies the top ones as potentially vulnerable. Our experimental results on 11 real-world projects have demonstrated that LEOPARD can cover 74.0% of vulnerable functions by identifying 20% of functions as vulnerable and outperform machine learning-based and static analysis-based techniques. We further propose three applications of LEOPARD for manual code review and fuzzing, through which we discovered 22 new bugs in real applications like PHP, radare2 and FFmpeg, and eight of them are new vulnerabilities.
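The two-step procedure described in the abstract (bin by complexity, then rank by vulnerability within each bin) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say which metrics are used or how they are aggregated, so the single integer `complexity` and `vulnerability` scores, the one-bin-per-complexity-score grouping, the `fraction` parameter, and the names `FunctionMetrics` and `select_candidates` are all assumptions.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionMetrics:
    name: str
    complexity: int     # hypothetical aggregate of complexity metrics
    vulnerability: int  # hypothetical aggregate of vulnerability metrics


def select_candidates(functions, fraction=0.2):
    """Return names of functions flagged as potentially vulnerable.

    Step 1: group functions into bins by their complexity score.
    Step 2: rank each bin by vulnerability score (highest first) and
            keep the top `fraction` of it, at least one function per bin.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Step 1: assumed binning rule, one bin per distinct complexity score.
    bins = defaultdict(list)
    for fn in functions:
        bins[fn.complexity].append(fn)

    # Step 2: rank within each bin and take the top-ranked functions.
    selected = []
    for score in sorted(bins):
        ranked = sorted(bins[score], key=lambda f: f.vulnerability, reverse=True)
        keep = math.ceil(len(ranked) * fraction)
        selected.extend(f.name for f in ranked[:keep])
    return selected


# Toy input (made up): five simple functions and two complex ones.
functions = [
    FunctionMetrics("parse_header", 1, 5),
    FunctionMetrics("init_state", 1, 2),
    FunctionMetrics("copy_buffer", 1, 9),
    FunctionMetrics("get_flag", 1, 0),
    FunctionMetrics("set_flag", 1, 1),
    FunctionMetrics("decode_frame", 7, 3),
    FunctionMetrics("demux_stream", 7, 8),
]
print(select_candidates(functions))
```

Because selection happens per bin in this sketch, both the low-complexity and the high-complexity group contribute a candidate, rather than the output being drawn only from the most complex functions.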

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.11479/full.md

## Figures

25 figures with captions in the complete paper: https://tomesphere.com/paper/1901.11479/full.md

## References

82 references — full list in the complete paper: https://tomesphere.com/paper/1901.11479/full.md

---
Source: https://tomesphere.com/paper/1901.11479