Trident: Improving Malware Detection with LLMs and Behavioral Features
Rebecca Saul, Jingzhi Jiang, Elliott Chia, and David Wagner

TL;DR
Trident is a malware detection system that combines static features, behavior-based rules generated by LLMs, and direct LLM analysis of sandbox reports to improve robustness and reduce false positives.
Contribution
The paper introduces Trident, a malware detection system that integrates three signals: static features, LLM-generated behavioral rules, and direct LLM analysis of sandbox reports.
Findings
Trident outperforms static feature-based methods in detection accuracy.
LLM-generated behavioral rules are much more robust to concept drift than standard static-feature methods.
Trident maintains low false positive rates while improving resilience to evolving malware.
Abstract
Traditionally, machine learning methods for PE malware detection have relied on static features like byte histograms, string information, and PE header contents. One barrier to incorporating dynamic analysis features has been the semi-structured nature of sandbox behavior reports. We show that, using the latest generation of large language models with reasoning, it is possible to efficiently process these behavior reports and utilize them as part of a malware detection pipeline. Specifically, we leverage LLMs to generate behavior-based malware detection rules based on a small training set of labeled malware. We find that these detection rules, derived from behavioral features, are much more robust to concept drift than standard static-feature methods, while maintaining practical false positive rates. Finally, we introduce Trident, a system which combines a classic decision tree model…
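To make the idea of a behavior-based detection rule concrete, the following is a minimal hand-written sketch of how such rules could be evaluated against a semi-structured sandbox report. It is not the paper's rule format or pipeline: the report schema (`registry_writes`, `commands`), the `BehaviorRule` type, and both example rules are assumptions invented for illustration, and in the paper the rules are generated by an LLM rather than written by hand.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical report shape: a dict with lists of observed behaviors.
# Real sandbox reports differ; these field names are assumptions.
Report = dict[str, Any]


@dataclass(frozen=True)
class BehaviorRule:
    """A named predicate over a sandbox report (illustrative only)."""
    name: str
    predicate: Callable[[Report], bool]


def writes_run_key(report: Report) -> bool:
    # Persistence via a registry Run key.
    return any(
        "\\currentversion\\run" in key.lower()
        for key in report.get("registry_writes", [])
    )


def deletes_shadow_copies(report: Report) -> bool:
    # Volume shadow copy deletion, a behavior often seen in ransomware.
    return any(
        "vssadmin" in cmd.lower() and "delete shadows" in cmd.lower()
        for cmd in report.get("commands", [])
    )


RULES = [
    BehaviorRule("persistence_run_key", writes_run_key),
    BehaviorRule("shadow_copy_deletion", deletes_shadow_copies),
]


def matched_rules(report: Report, rules: list[BehaviorRule] = RULES) -> list[str]:
    """Return the names of all rules whose predicate fires on the report."""
    return [rule.name for rule in rules if rule.predicate(report)]


def is_flagged(report: Report) -> bool:
    """Flag a sample if at least one behavior rule matches."""
    return bool(matched_rules(report))
```

Because rules of this kind key on what a sample does at run time rather than on its bytes or header layout, it is plausible that they would be less sensitive to repacking or recompilation, which is consistent with the robustness to concept drift that the abstract reports.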
