STRIDE: Simple Type Recognition In Decompiled Executables
Harrison Green, Edward J. Schwartz, Claire Le Goues, Bogdan Vasilescu

TL;DR
STRIDE is a lightweight, fast method for predicting variable types and names in decompiled executables. It matches the performance of more complex models with a much simpler implementation.
Contribution
We introduce STRIDE, a simple token-matching technique that achieves performance comparable to the state of the art in variable retyping and renaming for decompiled code, with minimal complexity.
Findings
On three benchmark datasets, STRIDE performs comparably to state-of-the-art machine learning models for both variable retyping and renaming.
It is faster and simpler to implement than transformer-based approaches.
An open-source implementation is available for community use.
Abstract
Decompilers are widely used by security researchers and developers to reverse engineer executable code. While modern decompilers are adept at recovering instructions, control flow, and function boundaries, some useful information from the original source code, such as variable types and names, is lost during the compilation process. Our work aims to predict these variable types and names from the remaining information. We propose STRIDE, a lightweight technique that predicts variable names and types by matching sequences of decompiler tokens to those found in training data. We evaluate it on three benchmark datasets and find that STRIDE achieves comparable performance to state-of-the-art machine learning models for both variable retyping and renaming while being much simpler and faster. We perform a detailed comparison with two recent SOTA transformer-based models in order to…
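The core idea in the abstract, predicting a variable's type or name by matching the decompiler tokens around it against token sequences seen in training data, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ContextMatcher`, the `<VAR>` placeholder, the context sizes, and the longest-match-first fallback are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

VAR = "<VAR>"  # hypothetical placeholder marking the variable being predicted


def context(tokens, index, n):
    """Return up to n tokens on each side of position `index`."""
    left = tuple(tokens[max(0, index - n):index])
    right = tuple(tokens[index + 1:index + 1 + n])
    return left, right


class ContextMatcher:
    """Toy predictor: look up the surrounding tokens in a table of training contexts."""

    def __init__(self, sizes=(4, 2, 1)):
        self.sizes = sizes                 # context widths, tried widest first (assumed)
        self.table = defaultdict(Counter)  # (n, left, right) -> label counts

    def train(self, tokens, labels):
        """`labels` maps a token index to the known type or name at that position."""
        for i, label in labels.items():
            for n in self.sizes:
                left, right = context(tokens, i, n)
                self.table[(n, left, right)][label] += 1

    def predict(self, tokens, index):
        """Return the most frequent label for the widest matching context, else None."""
        for n in self.sizes:
            left, right = context(tokens, index, n)
            votes = self.table.get((n, left, right))
            if votes:
                return votes.most_common(1)[0][0]
        return None


matcher = ContextMatcher()
# Training example: the variable passed to strlen is known to be a `char *`.
matcher.train(["strlen", "(", VAR, ")", ";"], {2: "char *"})

# Unseen function: the widest context differs, but the 2-token context matches.
prediction = matcher.predict(["return", "strlen", "(", VAR, ")", ";"], 3)
no_match = matcher.predict(["foo", VAR, "bar"], 1)
```

Here `prediction` falls back from the 4-token context (which includes the unseen `return`) to the 2-token context and recovers `char *`, while `no_match` is `None` because no training context agrees at any width. A lookup table of this kind needs no model training beyond counting, which is one plausible reading of why such an approach can be simpler and faster than a transformer.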
Taxonomy
Topics: Semantic Web and Ontologies · Formal Methods in Verification · Business Process Modeling and Analysis
