The Inconvenient Truths of Ground Truth for Binary Analysis

Jim Alves-Foss; Varsha Venugopal

arXiv:2210.15079·cs.CR·October 28, 2022

TL;DR

This paper critically examines the concept of ground truth in binary analysis, highlighting its variability and importance for accurate evaluation and machine learning training.

Contribution

It challenges the binary analysis community to clarify and standardize the definition of ground truth for more reliable tool assessment.

Findings

01

Not all ground truths are equivalent in binary analysis.

02

Misaligned ground truths can lead to misleading evaluations.

03

Clarifying ground truth is crucial for machine learning effectiveness.

Abstract

The effectiveness of binary analysis tools and techniques is often measured with respect to how well they map to a ground truth. We have found that not all ground truths are created equal. This paper challenges the binary analysis community to take a long look at the concept of ground truth and to ensure that we agree on the definition(s) of ground truth, so that we can be confident in the evaluation of tools and techniques. This becomes even more important as we move to trained machine learning models, which are only as useful as the ground truth used in their training is valid.
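To make the evaluation concern concrete, here is a minimal sketch (not taken from the paper) of how a single tool's output can score differently depending on which ground truth it is compared against. The task (function-start detection), the addresses, and the names `precision_recall`, `tool_output`, `ground_truth_a`, and `ground_truth_b` are all hypothetical and chosen only for illustration.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) of predicted items against a ground-truth set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical function-start addresses reported by a tool under evaluation.
tool_output = {0x1000, 0x1040, 0x10A0, 0x1100}

# Two hypothetical ground truths for the same binary. They disagree on
# whether 0x1180 counts as a function start (illustrative values only).
ground_truth_a = {0x1000, 0x1040, 0x10A0, 0x1100}
ground_truth_b = {0x1000, 0x1040, 0x10A0, 0x1100, 0x1180}

scores = {
    "ground_truth_a": precision_recall(tool_output, ground_truth_a),
    "ground_truth_b": precision_recall(tool_output, ground_truth_b),
}

for name, (precision, recall) in scores.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

Under these assumed sets, the same tool output is a perfect match against one ground truth and misses an entry against the other, so a reported score is only interpretable once the ground truth definition is stated.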

Taxonomy

Topics: Neural Networks and Applications