The Rise of GoodFATR: A Novel Accuracy Comparison Methodology for Indicator Extraction Tools
Juan Caballero, Gibran Gomez, Srdjan Matic, Gustavo Sánchez, Silvia Sebastián, and Arturo Villacañas

TL;DR
This paper introduces GoodFATR, a platform that uses a novel majority vote methodology to compare the accuracy of tools that extract indicators from threat reports, without needing a ground truth dataset.
Contribution
The work presents a new accuracy comparison methodology for IOC extraction tools and implements it in an automated platform supporting multiple data sources.
Findings
GoodFATR collected nearly half a million reports over 15 months.
It extracted over 978,000 indicators and identified 618,217 IOCs.
The methodology enables comparison of 7 IOC extraction tools without ground truth.
Abstract
To adapt to a constantly evolving landscape of cyber threats, organizations need to actively collect Indicators of Compromise (IOCs), i.e., forensic artifacts that signal that a host or network might have been compromised. IOCs can be collected through open-source and commercial structured IOC feeds. However, they can also be extracted from a myriad of unstructured threat reports written in natural language and distributed through a wide array of sources such as blogs and social media. Multiple indicator extraction tools can identify IOCs in natural language reports, but it is hard to compare their accuracy due to the difficulty of building large ground truth datasets. This work presents a novel majority vote methodology for comparing the accuracy of indicator extraction tools, which does not require a manually-built ground truth. We implement our methodology into GoodFATR,…
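The abstract does not spell out how the majority vote is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that an indicator counts as correct for a report when more than half of the tools extract it, and that each tool is then scored against that voted set. The function names, the strict-majority threshold, and the sample tool names and indicators are all hypothetical.

```python
from collections import Counter


def majority_vote(extractions):
    """Return the indicators extracted by more than half of the tools.

    `extractions` maps a (hypothetical) tool name to the set of indicators
    that tool extracted from a single report.
    """
    votes = Counter()
    for indicators in extractions.values():
        votes.update(indicators)  # each tool votes once per indicator
    threshold = len(extractions) / 2  # assumption: strict majority
    return {ioc for ioc, n in votes.items() if n > threshold}


def score_tools(extractions):
    """Compute (precision, recall) per tool against the voted consensus."""
    consensus = majority_vote(extractions)
    scores = {}
    for tool, found in extractions.items():
        true_positives = len(found & consensus)
        precision = true_positives / len(found) if found else 0.0
        recall = true_positives / len(consensus) if consensus else 0.0
        scores[tool] = (precision, recall)
    return scores


# Illustrative input only: three made-up tools run on one made-up report.
extractions = {
    "tool_a": {"evil.example", "10.0.0.1"},
    "tool_b": {"evil.example", "10.0.0.1", "readme.txt"},
    "tool_c": {"evil.example"},
}
scores = score_tools(extractions)
```

In this toy input, "readme.txt" is extracted by only one tool, so it is treated as a likely false positive and lowers that tool's precision. The voted set stands in for a manually built ground truth, which is the property the abstract highlights.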
Taxonomy
Topics: Multi-Criteria Decision Making
