Deconstructing Obfuscation: A four-dimensional framework for evaluating Large Language Models assembly code deobfuscation capabilities

Anton Tkachenko; Dmitrij Suskevic; Benjamin Adolphi

arXiv:2505.19887 · cs.SE · June 6, 2025

Open Access

TL;DR

This paper evaluates the capabilities of large language models in assembly code deobfuscation, revealing significant performance variability and fundamental limitations, and proposing a four-dimensional framework to understand these variations.

Contribution

It introduces the first comprehensive evaluation of commercial LLMs for assembly deobfuscation and proposes a theoretical framework explaining their performance variations.

Findings

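01. LLM performance varies widely across models and obfuscation scenarios, from autonomous deobfuscation to complete failure.

02. Certain obfuscation techniques remain resistant to LLMs.

03. Five recurring error patterns (predicate misinterpretation, structural mapping errors, control flow misinterpretation, arithmetic transformation errors, and constant propagation errors) limit LLM effectiveness.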

Abstract

Large language models (LLMs) have shown promise in software engineering, yet their effectiveness for binary analysis remains unexplored. We present the first comprehensive evaluation of commercial LLMs for assembly code deobfuscation. Testing seven state-of-the-art models against four obfuscation scenarios (bogus control flow, instruction substitution, control flow flattening, and their combination), we found striking performance variations, ranging from autonomous deobfuscation to complete failure. To explain these variations, we propose a theoretical framework based on four dimensions: Reasoning Depth, Pattern Recognition, Noise Filtering, and Context Integration. Our analysis identifies five error patterns: predicate misinterpretation, structural mapping errors, control flow misinterpretation, arithmetic transformation errors, and constant propagation errors, revealing fundamental…
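For readers unfamiliar with the three obfuscation techniques named in the abstract, the following is a minimal sketch of each, written in Python rather than assembly for readability. Every function name and constant here is hypothetical and chosen for illustration; none of it is taken from the paper's test cases or tooling.

```python
# Illustrative sketch only: hypothetical functions, not the paper's benchmarks.

def add_plain(a: int, b: int) -> int:
    return a + b

def add_substituted(a: int, b: int) -> int:
    # Instruction substitution: replace a + b with an equivalent but less
    # obvious expression, using the identity a + b == (a ^ b) + 2 * (a & b).
    return (a ^ b) + 2 * (a & b)

def abs_plain(x: int) -> int:
    return x if x >= 0 else -x

def abs_bogus_flow(x: int) -> int:
    # Bogus control flow: guard the real code with an opaque predicate.
    # x * (x + 1) is a product of consecutive integers, so it is always even.
    if (x * (x + 1)) % 2 == 0:
        return x if x >= 0 else -x
    # Dead branch: never executed, present only to mislead the analyst.
    return x * 31 + 7

def abs_flattened(x: int) -> int:
    # Control flow flattening: the original if/else becomes a dispatcher
    # loop driven by a state variable (state 3 is the exit state).
    state = 0
    result = 0
    while state != 3:
        if state == 0:
            state = 1 if x >= 0 else 2
        elif state == 1:
            result = x
            state = 3
        elif state == 2:
            result = -x
            state = 3
    return result
```

Deobfuscation, in this simplified picture, means recovering `add_plain` and `abs_plain` from their transformed counterparts: simplifying the arithmetic identity, recognizing that the opaque predicate is constant, and reconstructing the original branch structure from the dispatcher.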

Taxonomy

Topics: Software Engineering Research · Advanced Malware Detection Techniques