AutoMESC: Automatic Framework for Mining and Classifying Ethereum Smart Contract Vulnerabilities and Their Fixes
Majd Soud, Ilham Qasse, Grischa Liebel, Mohammad Hamdaqa

TL;DR
AutoMESC is an automated framework that mines, classifies, and constructs a comprehensive, up-to-date dataset of Ethereum smart contract vulnerabilities and fixes from GitHub and CVE records, facilitating data-driven security research.
Contribution
The paper introduces AutoMESC, described as the first fully automated system for mining and classifying Ethereum smart contract vulnerabilities and their fixes, and provides a publicly available, continuously updated dataset.
Findings
Constructed a dataset with 6.7K vulnerability-fix pairs in Solidity.
Assessed dataset quality in terms of accuracy, provenance, and relevance.
Compared the dataset with existing datasets, demonstrating its comprehensiveness.
Abstract
Due to the risks associated with vulnerabilities in smart contracts, their security has gained significant attention in recent years. However, there is a lack of open datasets on smart contract vulnerabilities and their fixes that allow for data-driven research. Towards this end, we propose an automated method for mining and classifying Ethereum's smart contract vulnerabilities and their corresponding fixes from GitHub and from the Common Vulnerabilities and Exposures (CVE) records in the National Vulnerability Database. We implemented the proposed method in a fully automated framework, which we call AutoMESC. AutoMESC uses seven of the most well-known smart contract security tools to classify and label the collected vulnerabilities based on vulnerability types. Furthermore, it collects metadata that can be used in data-intensive smart contract security research (e.g., vulnerability…
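As a rough illustration of the two steps the abstract names (mining candidate fixes, then labelling them with several analysis tools), the sketch below filters commit messages by keyword and combines per-tool findings by a simple vote. It is not AutoMESC's implementation: the keyword list, the `FixPair` record, the function names, and the voting threshold are all assumptions made for this example, and the abstract does not say how the seven tools' outputs are reconciled.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical keywords; the abstract does not list the search terms AutoMESC uses.
FIX_KEYWORDS = ("fix", "vulnerab", "security", "reentrancy", "overflow")


@dataclass
class FixPair:
    """One mined vulnerability-fix pair (field names are illustrative)."""
    repo: str
    commit_sha: str
    message: str
    vulnerable_source: str  # Solidity source before the fixing commit
    fixed_source: str       # Solidity source after the fixing commit
    labels: list[str] = field(default_factory=list)


def looks_like_security_fix(message: str) -> bool:
    """Return True if a commit message contains any of the assumed keywords."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in FIX_KEYWORDS)


def aggregate_labels(tool_reports: dict[str, set[str]], min_votes: int = 2) -> list[str]:
    """Keep a vulnerability type if at least `min_votes` tools report it.

    `tool_reports` maps a tool name to the set of vulnerability types it flagged.
    Voting is an assumption of this sketch, not a documented AutoMESC rule.
    """
    votes = Counter(label for labels in tool_reports.values() for label in labels)
    return sorted(label for label, count in votes.items() if count >= min_votes)
```

A pair would be kept only if `looks_like_security_fix(pair.message)` holds, and its `labels` would then be set from `aggregate_labels(...)` applied to the tool reports for `pair.vulnerable_source`. Raising `min_votes` trades recall for fewer false positives, which matters because static analyzers often disagree on the same contract.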
Taxonomy
Topics: Blockchain Technology Applications and Security · Cybercrime and Law Enforcement Studies
