Benchmarking Large Language Models for IoC Recovery under Adversarial Code Obfuscation and Encryption

Jaime Morales; Sergio Pastrana; Juan Tapiador

arXiv:2605.06910 · cs.CR · May 11, 2026

TL;DR

This paper presents a benchmark that evaluates how well large language models recover Indicators of Compromise (IoCs) from obfuscated and encrypted JavaScript code. The models handle simple transformations well but struggle once encryption is applied.

Contribution

The paper introduces a systematic benchmark dataset and evaluation framework for assessing LLMs on IoC recovery under adversarial code transformations, including encryption.

Findings

01. LLMs perform well on simple obfuscations such as variable renaming.

02. Encryption-based concealment significantly reduces detection accuracy.

03. The benchmark highlights encryption as a major challenge for automated threat analysis.

Abstract

Software obfuscation and encryption present persistent challenges for program comprehension and security analysis, particularly when adversaries conceal Indicators of Compromise (IoCs) such as IP addresses within source code. While Large Language Models (LLMs) have recently demonstrated remarkable progress in code reasoning and transformation, their resilience against adversarial concealment techniques remains largely uncharted. This paper introduces a systematic benchmark for secret detection under adversarial code transformations, designed to evaluate the capacity of LLMs to recover IoCs embedded in obfuscated and encrypted JavaScript programs. We construct a dataset of 336 programs, progressively transformed through 12 levels of obfuscation and cryptographic concealment (including XOR and AES-256), to emulate realistic threat scenarios. An automated evaluation framework…
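
To make the contrast between renaming and cryptographic concealment concrete, here is a minimal sketch in Python. It is not the paper's dataset generator or evaluation framework. It assumes a hypothetical IoC (an address from the RFC 5737 documentation range), a hypothetical repeating XOR key, and a plain regex scanner as a stand-in for a naive pattern-based detector. The JavaScript snippets are illustrative strings, not samples from the benchmark, and AES-256 is not shown.

```python
import re

# Naive IPv4 pattern: a stand-in for a pattern-based detector (assumption, not the paper's method).
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def xor_conceal(secret: str, key: bytes) -> list[int]:
    """XOR each byte of the secret with a repeating key."""
    data = secret.encode("utf-8")
    return [b ^ key[i % len(key)] for i, b in enumerate(data)]


def xor_recover(values: list[int], key: bytes) -> str:
    """Invert xor_conceal; XOR with the same key restores the original bytes."""
    return bytes(v ^ key[i % len(key)] for i, v in enumerate(values)).decode("utf-8")


def find_ipv4(source: str) -> list[str]:
    """Return every substring of the source that looks like an IPv4 address."""
    return IPV4_RE.findall(source)


ioc = "203.0.113.7"  # hypothetical IoC from the RFC 5737 documentation range
key = b"k3y"         # hypothetical key, chosen only for illustration

# Level 0: the IoC appears as a plain string literal.
plain_js = 'const host = "' + ioc + '"; connect(host);'

# Variable renaming: identifiers change, but the literal is untouched.
renamed_js = 'const _0x1a = "' + ioc + '"; _0x2b(_0x1a);'

# XOR concealment: only the encoded bytes and the decoding logic remain in the source.
hidden = xor_conceal(ioc, key)
xor_js = (
    "const d = " + str(hidden) + "; const k = 'k3y'; "
    "connect(String.fromCharCode(...d.map((c, i) => c ^ k.charCodeAt(i % k.length))));"
)

for label, src in [("plain", plain_js), ("renamed", renamed_js), ("xor", xor_js)]:
    print(label, find_ipv4(src))

# Recovering the IoC from the XOR variant requires reasoning about the decoding step.
print(xor_recover(hidden, key))
```

Renaming leaves the literal in place, so surface matching still finds it. After XOR, the address only exists once the decoding expression is evaluated, so a model has to trace that computation or reproduce it to recover the IoC.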
