LLMs as Firmware Experts: A Runtime-Grown Tree-of-Agents Framework

Xiangrui Zhang; Zeyu Chen; Haining Wang; Qiang Li

arXiv:2511.18438 · cs.CR · November 25, 2025

Open Access

TL;DR

FIRMHIVE leverages a recursive Tree of Agents framework to enhance LLM-based firmware security analysis, enabling deeper, broader, and more accurate vulnerability detection in complex firmware images.

Contribution

This paper introduces FIRMHIVE, a novel recursive agent hive framework that improves LLM performance in firmware security analysis through decentralized coordination and executable delegation.

Findings

01 Performs 16x more reasoning steps than baselines

02 Inspects 2.3x more files, increasing alert yield

03 Detects 1.5x more vulnerabilities with 71% precision

Abstract

Large Language Models (LLMs) and their agent systems have recently demonstrated strong potential in automating code reasoning and vulnerability detection. However, when applied to large-scale firmware, their performance degrades due to the binary nature of firmware, complex dependency structures, and heterogeneous components. To address this challenge, this paper presents FIRMHIVE, a recursive agent hive that enables LLMs to act as autonomous firmware security analysts. FIRMHIVE introduces two key mechanisms: (1) transforming delegation into a per-agent, executable primitive and (2) constructing a runtime Tree of Agents (ToA) for decentralized coordination. We evaluate FIRMHIVE using real-world firmware images obtained from publicly available datasets, covering five representative security analysis tasks. Compared with existing LLM-agent baselines, FIRMHIVE performs deeper (about 16x…
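To make the two mechanisms named in the abstract concrete, here is a minimal sketch of delegation as a per-agent operation and of a tree of agents that grows while the analysis runs. It is an illustration under stated assumptions, not FIRMHIVE's implementation: the `Agent` class, the `delegate` and `run` methods, the `split_by_path` policy, and the example firmware layout are all hypothetical, and a hard-coded lookup table stands in for the LLM's decision about when to delegate.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical policy type: given a task and the agent's depth, return the
# sub-tasks to delegate. An empty list means "handle the task locally".
Policy = Callable[[str, int], List[str]]


@dataclass
class Agent:
    task: str
    depth: int = 0
    children: List["Agent"] = field(default_factory=list)
    findings: List[str] = field(default_factory=list)

    def delegate(self, subtask: str) -> "Agent":
        # Delegation as a per-agent primitive: any agent can spawn a child,
        # so the tree is built at runtime rather than planned up front.
        child = Agent(task=subtask, depth=self.depth + 1)
        self.children.append(child)
        return child

    def run(self, policy: Policy, max_depth: int = 3) -> List[str]:
        # Ask the policy for sub-tasks unless the depth limit is reached.
        subtasks = policy(self.task, self.depth) if self.depth < max_depth else []
        if not subtasks:
            # Leaf agent: record a placeholder result for its own task.
            self.findings.append(f"analyzed {self.task}")
        for sub in subtasks:
            child = self.delegate(sub)
            # Child results flow back up to the delegating parent.
            self.findings.extend(child.run(policy, max_depth))
        return self.findings


def split_by_path(task: str, depth: int) -> List[str]:
    # Assumed stand-in for an LLM decision: split directory-like tasks
    # into their entries, using a made-up firmware layout.
    layout = {
        "firmware": ["firmware/bin", "firmware/etc"],
        "firmware/bin": ["firmware/bin/httpd", "firmware/bin/busybox"],
    }
    return layout.get(task, [])


def tree_size(agent: Agent) -> int:
    # Count the agents in the tree rooted at `agent`.
    return 1 + sum(tree_size(child) for child in agent.children)


root = Agent("firmware")
results = root.run(split_by_path)
print(results)
print(tree_size(root))
```

In this sketch the shape of the tree is decided entirely by what each agent encounters, and each parent only sees the results its own children return. The `max_depth` bound is an assumption added here to keep the recursion finite; the abstract does not say how the paper limits tree growth.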

Taxonomy

Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Security and Verification in Computing