Discovery of Endianness and Instruction Size Characteristics in Binary Programs from Unknown Instruction Set Architectures
Joachim Andreassen, Donn Morrison

TL;DR
This paper presents machine learning methods to identify endianness and instruction size characteristics in binary programs from unknown ISAs, aiding reverse engineering and hardware analysis.
Contribution
It introduces feature engineering techniques and models for detecting endianness and instruction width types in unknown ISAs, including fixed and variable instruction sizes.
Findings
Endianness detection accuracy of 99.4%.
Instruction size type detection accuracy of 86.0%.
Fixed instruction size detection accuracy of 88.0%.
Abstract
We study the problem of streamlining reverse engineering (RE) of binary programs from unknown instruction set architectures (ISAs). We focus on two ISA characteristics that are fundamental to beginning the RE process: identifying the endianness and determining whether the instruction width is fixed or variable. For ISAs with a fixed instruction width, we also present methods for estimating the width. In addition to advancing research in software RE, our work can also be seen as a first step in hardware reverse engineering, because endianness and instruction format describe intrinsic characteristics of the underlying ISA. We detail our efforts at feature engineering and perform experiments using a variety of machine learning models on two datasets of architectures, using Leave-One-Group-Out Cross-Validation to simulate conditions where the tested ISA is unknown during model training. We use bigram-based…
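The Leave-One-Group-Out evaluation mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: each group is one ISA, and every split holds out all samples of a single ISA so the held-out ISA is never seen during training. The abstract is truncated before it describes the bigram features, so `byte_bigram_counts` is a hypothetical stand-in feature, and the ISA names and byte strings are made-up toy data.

```python
from collections import Counter


def byte_bigram_counts(blob: bytes) -> Counter:
    """Count adjacent byte pairs in a binary blob.

    Hypothetical stand-in feature; not the paper's exact feature set.
    """
    return Counter(zip(blob, blob[1:]))


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) per distinct group."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Toy data (assumption): each sample is (ISA name, raw bytes).
samples = [
    ("isa_a", b"\x00\x01\x00\x02"),
    ("isa_a", b"\x00\x03\x00\x04"),
    ("isa_b", b"\x01\x00\x02\x00"),
    ("isa_c", b"\x10\x00\x20\x00"),
]
groups = [name for name, _ in samples]
features = [byte_bigram_counts(blob) for _, blob in samples]

# One split per ISA; the held-out ISA contributes no training samples.
splits = list(leave_one_group_out(groups))
for held_out, train_idx, test_idx in splits:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

In a real experiment, a classifier would be fit on the training indices of each split and scored on the held-out ISA, with the per-split scores then averaged.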
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Machine Learning and Data Classification · Advancements in Photolithography Techniques
