The Rise of GoodFATR: A Novel Accuracy Comparison Methodology for Indicator Extraction Tools

Juan Caballero, Gibran Gomez, Srdjan Matic, Gustavo Sánchez, Silvia Sebastián, and Arturo Villacañas

arXiv:2208.00042·cs.CR·March 9, 2023·5 cites

TL;DR

This paper introduces GoodFATR, a platform that uses a novel majority vote methodology to compare the accuracy of indicator extraction tools from threat reports without needing a ground truth dataset.

Contribution

The work presents a new accuracy comparison methodology for IOC extraction tools and implements it in an automated platform supporting multiple data sources.

Findings

1. GoodFATR collected nearly half a million reports over 15 months.
2. It extracted over 978,000 indicators and identified 618,217 IOCs.
3. The methodology enables comparison of 7 IOC extraction tools without ground truth.

Abstract

To adapt to a constantly evolving landscape of cyber threats, organizations actively need to collect Indicators of Compromise (IOCs), i.e., forensic artifacts that signal that a host or network might have been compromised. IOCs can be collected through open-source and commercial structured IOC feeds. However, they can also be extracted from a myriad of unstructured threat reports written in natural language and distributed through a wide array of sources such as blogs and social media. Multiple indicator extraction tools can identify IOCs in natural language reports, but it is hard to compare their accuracy due to the difficulty of building large ground truth datasets. This work presents a novel majority vote methodology for comparing the accuracy of indicator extraction tools, which does not require a manually built ground truth. We implement our methodology into GoodFATR,…
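As a rough illustration of the majority vote idea described in the abstract, the sketch below treats an indicator as correct when more than half of the tools extract it from the same report, then scores each tool against that consensus. The voting threshold, the use of precision and recall as scores, and all names (`majority_vote`, `score_tools`, the tool names, and the sample indicators) are assumptions made for illustration; the abstract does not specify these details, and the paper's actual procedure may differ.

```python
from collections import Counter


def majority_vote(extractions):
    """Return indicators extracted by more than half of the tools.

    `extractions` maps a (hypothetical) tool name to the set of indicators
    that tool extracted from one report. The strict-majority threshold is
    an assumption, not taken from the paper.
    """
    votes = Counter()
    for indicators in extractions.values():
        votes.update(set(indicators))  # each tool votes once per indicator
    threshold = len(extractions) / 2
    return {ioc for ioc, count in votes.items() if count > threshold}


def score_tools(extractions):
    """Score each tool against the consensus set as (precision, recall)."""
    consensus = majority_vote(extractions)
    scores = {}
    for tool, indicators in extractions.items():
        found = set(indicators)
        agreed = len(found & consensus)
        precision = agreed / len(found) if found else 0.0
        recall = agreed / len(consensus) if consensus else 0.0
        scores[tool] = (precision, recall)
    return scores


# Made-up extractions from a single report by three hypothetical tools.
example = {
    "tool_a": {"1.2.3.4", "evil.example", "d41d8cd98f00b204e9800998ecf8427e"},
    "tool_b": {"1.2.3.4", "evil.example"},
    "tool_c": {"1.2.3.4", "readme.txt"},
}

if __name__ == "__main__":
    print(sorted(majority_vote(example)))
    for tool, (precision, recall) in sorted(score_tools(example).items()):
        print(f"{tool}: precision={precision:.2f} recall={recall:.2f}")
```

In this toy input, `readme.txt` is a likely false positive reported by a single tool, so it is voted out of the consensus set and lowers that tool's precision. No manually labeled ground truth is needed, although the consensus is only as reliable as the tools taking part in the vote.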

Code & Models

Repositories

malicialab/iocsearcher (official)

Taxonomy

Topics: Multi-Criteria Decision Making