Attribution-Driven Explainable Intrusion Detection with Encoder-Based Large Language Models
Umesh Biswas, Shafqat Hasan, Syed Mohammed Farhan, Nisha Pillai, and Charan Gudla

TL;DR
This paper uses attribution analysis on encoder-based LLMs to improve transparency and trust in network intrusion detection within SDN, showing that models learn meaningful attack patterns from traffic data.
Contribution
It introduces an attribution-driven approach to analyze encoder-based LLMs, enhancing interpretability and validation in SDN intrusion detection.
Findings
Model decisions are driven by meaningful traffic behavior patterns.
Patterns align with established intrusion detection principles.
Attribution methods validate LLMs' understanding of attack behaviors.
Abstract
Software-Defined Networking (SDN) improves network flexibility but also increases the need for reliable and interpretable intrusion detection. Large Language Models (LLMs) have recently been explored for cybersecurity tasks due to their strong representation learning capabilities; however, their lack of transparency limits their practical adoption in security-critical environments. Understanding how LLMs make decisions is therefore essential. This paper presents an attribution-driven analysis of encoder-based LLMs for network intrusion detection using flow-level traffic features. Attribution analysis demonstrates that model decisions are driven by meaningful traffic behavior patterns, improving transparency and trust in transformer-based SDN intrusion detection. These patterns align with established intrusion detection principles, indicating that LLMs learn attack behavior from traffic…
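To make "attribution analysis" over flow-level features concrete, here is a minimal sketch. It rests on several assumptions. The abstract as shown does not name the attribution method, so integrated gradients is used here purely as one common choice. A tiny logistic model with hand-picked weights stands in for the encoder-based LLM. The feature names and the flow values are hypothetical and are not taken from the paper.

```python
import numpy as np

# Hypothetical flow-level feature names (illustrative; not taken from the paper).
FEATURES = ["flow_duration", "packet_count", "byte_count", "syn_flag_count"]

# Toy stand-in for a trained detector: logistic regression with fixed,
# hand-picked weights. A real encoder-based model would replace this.
WEIGHTS = np.array([-0.5, 1.2, 0.3, 2.0])
BIAS = -1.0


def predict(x):
    """Return the attack probability for a feature vector x."""
    return 1.0 / (1.0 + np.exp(-(x @ WEIGHTS + BIAS)))


def gradient(x):
    """Analytic gradient of predict(x) with respect to the input x."""
    p = predict(x)
    return p * (1.0 - p) * WEIGHTS


def integrated_gradients(x, baseline, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    # Interpolation coefficients at the midpoint of each of the `steps` intervals.
    alphas = (np.arange(steps) + 0.5) / steps
    # Points on the straight line from baseline to x, shape (steps, n_features).
    path = baseline + alphas[:, None] * (x - baseline)
    grads = np.array([gradient(point) for point in path])
    # Average gradient along the path, scaled by the input difference.
    return (x - baseline) * grads.mean(axis=0)


# Made-up, already-normalized flow record and an all-zeros baseline.
flow = np.array([0.2, 1.5, 0.8, 2.0])
baseline = np.zeros_like(flow)

attributions = integrated_gradients(flow, baseline)

# Rank features by the magnitude of their contribution to the prediction.
ranking = [FEATURES[i] for i in np.argsort(-np.abs(attributions))]

for name, score in zip(FEATURES, attributions):
    print(f"{name:>16}: {score:+.4f}")
print("ranking:", ranking)
```

In this toy setup the attributions sum approximately to `predict(flow) - predict(baseline)`, which is the completeness property of integrated gradients and a quick sanity check on the approximation. Comparing a ranking like this against known attack indicators is the kind of validation the abstract describes, although the paper's actual models, features, and attribution procedure may differ.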
