HELIOS: Hierarchical Graph Abstraction for Structure-Aware LLM Decompilation
Yonatan Gizachew Achamyeleh, Harsh Thomare, Mohammad Abdullah Al Faruque

TL;DR
HELIOS enhances LLM-based binary decompilation by incorporating hierarchical control flow graphs, significantly improving code correctness and compilability without fine-tuning, thus aiding reverse engineering across multiple architectures.
Contribution
HELIOS introduces a hierarchical graph abstraction to structure LLM decompilation, improving code quality and correctness without requiring model fine-tuning.
Findings
Increases the compilability of decompiled output from 45% to over 85%.
Reduces functional correctness spread across architectures.
Achieves over 94% compilability with compiler feedback.
Abstract
Large language models (LLMs) have recently been applied to binary decompilation, yet they still treat code as plain text and ignore the graphs that govern program control flow. This limitation often yields syntactically fragile and logically inconsistent output, especially for optimized binaries. This paper presents HELIOS, a framework that reframes LLM-based decompilation as a structured reasoning task. HELIOS summarizes a binary's control flow and function calls into a hierarchical text representation that spells out basic blocks, their successors, and high-level patterns such as loops and conditionals. This representation is supplied to a general-purpose LLM, along with raw decompiler output, optionally combined with a compiler-in-the-loop that returns error messages when the generated code fails to build. On HumanEval-Decompile for x86_64,…
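To make the idea of a hierarchical text representation concrete, here is a minimal sketch of how a control flow graph might be rendered as text that lists basic blocks, their successors, and simple patterns. It is an illustration only and not the HELIOS format: the block names, the output layout, and the functions `find_back_edges` and `describe_cfg` are assumptions introduced here. Loops are detected with standard depth-first back-edge detection, and a block with two successors that is not a loop header is labeled a conditional.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical CFG encoding: block name -> ordered list of successor block names.
CFG = Dict[str, List[str]]


def find_back_edges(cfg: CFG, entry: str) -> Set[Tuple[str, str]]:
    """Return edges (u, v) where v is still on the DFS stack when reached from u."""
    back_edges: Set[Tuple[str, str]] = set()
    visited: Set[str] = set()
    on_stack: Set[str] = set()

    def dfs(node: str) -> None:
        visited.add(node)
        on_stack.add(node)
        for succ in cfg.get(node, []):
            if succ in on_stack:
                back_edges.add((node, succ))  # edge closes a cycle
            elif succ not in visited:
                dfs(succ)
        on_stack.discard(node)

    dfs(entry)
    return back_edges


def describe_cfg(name: str, cfg: CFG, entry: str,
                 calls: Optional[Dict[str, List[str]]] = None) -> str:
    """Render a function's CFG as indented text (illustrative layout only)."""
    calls = calls or {}
    back_edges = find_back_edges(cfg, entry)
    loop_headers = {target for _, target in back_edges}

    lines = [f"function {name} (entry: {entry})"]
    for block, succs in cfg.items():
        lines.append(f"  block {block}")
        lines.append(f"    successors: {', '.join(succs) if succs else 'none (return)'}")
        if block in loop_headers:
            lines.append("    pattern: loop header")
        elif len(succs) == 2:
            lines.append("    pattern: conditional branch")
        for src, target in sorted(back_edges):
            if src == block:
                lines.append(f"    back edge to: {target}")
        if calls.get(block):
            lines.append(f"    calls: {', '.join(calls[block])}")
    return "\n".join(lines)


# A small loop: B0 -> B1, B1 -> {B2, B3}, B2 -> B1 (back edge), B3 returns.
loop_cfg: CFG = {"B0": ["B1"], "B1": ["B2", "B3"], "B2": ["B1"], "B3": []}
summary = describe_cfg("sum_to_n", loop_cfg, "B0", calls={"B2": ["printf"]})
print(summary)
```

Text of this kind could be placed in a prompt alongside raw decompiler output. How HELIOS actually structures its levels and labels its patterns is described in the paper itself.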
Taxonomy
Topics: Security and Verification in Computing · Logic, programming, and type systems · Scientific Computing and Data Management
