LLM4CodeRE: Generative AI for Code Decompilation Analysis and Reverse Engineering

Hamed Jelodar; Samita Bai; Tochukwu Emmanuel Nwankwo; Parisa Hamedi; Mohammad Meymani; Roozbeh Razavi-Far; Ali A. Ghorbani

arXiv:2604.06095 · cs.CR · April 8, 2026

TL;DR

LLM4CodeRE is a domain-adaptive large language model framework designed for bidirectional code decompilation and translation, specifically targeting malware reverse engineering with improved accuracy over existing tools.

Contribution

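The paper introduces two fine-tuning strategies for domain adaptation, a Multi-Adapter approach and a Seq2Seq Unified approach with task-conditioned prefixes, enabling translation between assembly and source code in both directions within a single model.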

Findings

01. Outperforms existing decompilation tools and general-purpose models.

02. Achieves robust bidirectional generalization.

03. Supports both assembly-to-source and source-to-assembly translation.

Abstract

Code decompilation analysis is a fundamental yet challenging task in malware reverse engineering, particularly due to the pervasive use of sophisticated obfuscation techniques. Although recent large language models (LLMs) have shown promise in translating low-level representations into high-level source code, most existing approaches rely on generic code pretraining and lack adaptation to malicious software. We propose LLM4CodeRE, a domain-adaptive LLM framework for bidirectional code reverse engineering that supports both assembly-to-source decompilation and source-to-assembly translation within a unified model. To enable effective task adaptation, we introduce two complementary fine-tuning strategies: (i) a Multi-Adapter approach for task-specific syntactic and semantic alignment, and (ii) a Seq2Seq Unified approach using task-conditioned prefixes to enforce end-to-end generation…
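
To make the idea of task-conditioned prefixes concrete, the sketch below shows one way a single aligned (assembly, source) pair could be turned into two training examples, one per translation direction, by prepending a direction-specific prefix to the input. This is only an illustration under stated assumptions: the prefix strings, the task names `asm2src` and `src2asm`, and the helpers `build_input` and `make_bidirectional_pairs` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical prefix strings: the abstract does not state the actual
# prefixes used by LLM4CodeRE, so these are placeholders for illustration.
TASK_PREFIXES = {
    "asm2src": "translate assembly to source: ",
    "src2asm": "translate source to assembly: ",
}


@dataclass
class Example:
    task: str         # which direction this example trains
    input_text: str   # prefixed text fed to the encoder
    target_text: str  # text the decoder should produce


def build_input(task: str, text: str) -> str:
    """Prepend the task prefix so one model can tell the directions apart."""
    if task not in TASK_PREFIXES:
        raise ValueError(f"unknown task: {task!r}")
    return TASK_PREFIXES[task] + text


def make_bidirectional_pairs(asm: str, src: str) -> List[Example]:
    """Turn one aligned (assembly, source) pair into two training examples."""
    return [
        Example("asm2src", build_input("asm2src", asm), src),
        Example("src2asm", build_input("src2asm", src), asm),
    ]


if __name__ == "__main__":
    pairs = make_bidirectional_pairs("mov eax, 1\nret", "int f(void) { return 1; }")
    for ex in pairs:
        print(ex.task, "|", repr(ex.input_text), "->", repr(ex.target_text))
```

In a setup like this, the same model weights serve both directions and only the prefix changes, which is the general motivation for conditioning on a task string rather than training two separate models.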
