The Inconvenient Truths of Ground Truth for Binary Analysis
Jim Alves-Foss, Varsha Venugopal

TL;DR
This paper critically examines the concept of ground truth in binary analysis, highlighting its variability and importance for accurate evaluation and machine learning training.
Contribution
It challenges the binary analysis community to clarify and standardize the definition of ground truth for more reliable tool assessment.
Findings
Not all ground truths are equivalent in binary analysis.
Misaligned ground truths can lead to misleading evaluations.
Clarifying ground truth is crucial for machine learning effectiveness.
Abstract
The effectiveness of binary analysis tools and techniques is often measured by how well their output matches a ground truth. We have found that not all ground truths are created equal. This paper challenges the binary analysis community to take a long look at the concept of ground truth and to ensure that we agree on the definition(s) of ground truth, so that we can be confident in the evaluation of tools and techniques. This becomes even more important as we move to trained machine learning models, which are only as useful as the ground truth used in training is valid.
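To make the point about evaluation concrete, here is a minimal sketch of how one tool's output can score differently against two ground truths. It is not taken from the paper: the task (function-start detection), the addresses, and the names `precision_recall`, `truth_a`, and `truth_b` are all assumptions chosen for illustration.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) of a predicted set against a ground-truth set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical function-start addresses reported by one tool (invented data).
tool_output = {0x1000, 0x1040, 0x1080, 0x10C0}

# Two hypothetical ground truths for the same binary that disagree on
# which addresses count as function starts (invented data).
truth_a = {0x1000, 0x1040, 0x1080}
truth_b = {0x1000, 0x1040, 0x1080, 0x10C0, 0x1100}

score_a = precision_recall(tool_output, truth_a)
score_b = precision_recall(tool_output, truth_b)

print("against truth_a: precision=%.2f recall=%.2f" % score_a)
print("against truth_b: precision=%.2f recall=%.2f" % score_b)
```

Under these made-up sets, the same output looks imprecise but complete against `truth_a` and precise but incomplete against `truth_b`, so a reported score is only meaningful alongside a statement of which ground truth was used.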
Taxonomy
Topics: Neural Networks and Applications
