Automatic Library Version Identification, an Exploration of Techniques
Thomas Rinsma

TL;DR
This paper explores various binary comparison and fingerprinting techniques for identifying library versions within binaries, implementing six methods in an open-source tool and empirically evaluating their effectiveness on real and artificial samples.
Contribution
It introduces an open-source tool applying six fingerprinting techniques for library version identification and provides an empirical analysis of their effectiveness.
Findings
Readable string-based techniques perform best.
One technique correctly identifies multiple libraries in stripped binaries.
Empirical results validate the effectiveness of certain fingerprinting methods.
Abstract
This paper is the result of a two-month research internship on the topic of library version identification. Ideas and techniques from the literature on binary comparison and fingerprinting are outlined and applied to the problem of (version) identification of shared libraries and of libraries within statically linked binary executables. Six comparison techniques are chosen and implemented in an open-source tool, which in turn uses the open-source radare2 framework for signature generation. The effectiveness of the techniques is analyzed empirically by comparing both artificial and real sample files against a reference dataset of multiple versions of dozens of libraries. The results show that, of these techniques, those based on readable strings perform best and that one of these techniques correctly identifies multiple libraries contained in a…
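To make the idea of a readable-string fingerprint concrete, here is a minimal sketch. It is not the paper's tool and does not use radare2: it pulls printable ASCII runs out of raw bytes with a regular expression and scores each reference library by the fraction of its strings that also occur in the sample. The function names, the four-character minimum length, and the containment score are all assumptions made for illustration.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> set[str]:
    """Return the set of printable ASCII runs of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return {m.group().decode("ascii") for m in pattern.finditer(data)}


def containment(reference: set[str], sample: set[str]) -> float:
    """Fraction of the reference strings that also appear in the sample."""
    if not reference:
        return 0.0
    return len(reference & sample) / len(reference)


def rank_versions(sample: bytes, references: dict[str, bytes]) -> list[tuple[str, float]]:
    """Score every reference library version against the sample, best first.

    `references` maps a hypothetical label such as "libfoo-1.1" to the raw
    bytes of that library build.
    """
    sample_strings = extract_strings(sample)
    scores = [
        (label, containment(extract_strings(blob), sample_strings))
        for label, blob in references.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

A containment score is used here rather than a symmetric similarity because a statically linked executable holds many strings that do not belong to the library, and these should not lower the score. Whether the paper's techniques score matches this way is not stated in the abstract.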
Taxonomy
Topics: Advanced Malware Detection Techniques · Digital and Cyber Forensics · Digital Media Forensic Detection
