JsDeObsBench: Measuring and Benchmarking LLMs for JavaScript Deobfuscation
Guoqiang Chen, Xin Jin, Zhiqiang Lin

TL;DR
This paper introduces JsDeObsBench, a comprehensive benchmark for evaluating large language models' effectiveness in JavaScript deobfuscation, highlighting their strengths and limitations in security-related scenarios.
Contribution
The paper presents a systematic benchmark and methodology for assessing LLMs in JS deobfuscation, filling a critical gap in security research.
Findings
LLMs like GPT-4o and Mixtral outperform baselines in code simplification
Challenges remain in syntax accuracy and execution reliability
LLMs show promise in deobfuscating JS malware
Abstract
Deobfuscating JavaScript (JS) code poses a significant challenge in web security, particularly as obfuscation techniques are frequently used to conceal malicious activities within scripts. While Large Language Models (LLMs) have recently shown promise in automating the deobfuscation process, transforming detection and mitigation strategies against these obfuscated threats, a systematic benchmark to quantify their effectiveness and limitations has been notably absent. To address this gap, we present JsDeObsBench, a dedicated benchmark designed to rigorously evaluate the effectiveness of LLMs in the context of JS deobfuscation. We detail our benchmarking methodology, which includes a wide range of obfuscation techniques ranging from basic variable renaming to sophisticated structure transformations, providing a robust framework for assessing LLM performance in real-world scenarios. Our…
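To make the "basic variable renaming" end of that spectrum concrete, the following is a minimal Python sketch of a toy renaming pass over a JS snippet. It is an illustration only, not the obfuscator or tooling used by JsDeObsBench: the function name `rename_identifiers`, the `_0x....` naming scheme, and the regex-based approach are all assumptions made for this example. A real obfuscator would parse the code into an AST; this sketch also ignores function parameters, string contents, and comments.

```python
import re


def rename_identifiers(js_source: str) -> tuple[str, dict[str, str]]:
    """Toy variable-renaming obfuscation (hypothetical helper, regex-based).

    Renames identifiers introduced by var/let/const/function to opaque
    names. Property accesses such as `obj.name` are left untouched.
    """
    # Collect declared names in order of first appearance.
    declared = re.findall(
        r"\b(?:var|let|const|function)\s+([A-Za-z_$][\w$]*)", js_source
    )
    mapping: dict[str, str] = {}
    for name in declared:
        if name not in mapping:
            mapping[name] = f"_0x{len(mapping):04x}"
    if not mapping:
        return js_source, mapping

    # Match whole identifiers only, and skip names that follow a dot
    # (property access), since those are not the declared bindings.
    names = "|".join(re.escape(n) for n in mapping)
    pattern = re.compile(r"(?<![\w$.])(" + names + r")(?![\w$])")
    return pattern.sub(lambda m: mapping[m.group(1)], js_source), mapping


original = (
    'const greeting = "hi";\n'
    "let count = 2;\n"
    "function repeat() { return greeting.repeat(count); }\n"
)
obfuscated, mapping = rename_identifiers(original)
print(obfuscated)
```

Deobfuscation in this narrow case means recovering readable names for the opaque identifiers while preserving behavior; the structure transformations mentioned in the abstract are considerably harder to undo than this.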
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Web Application Security Vulnerabilities
