LLM-FS: Zero-Shot Feature Selection for Effective and Interpretable Malware Detection

Naveen Gill; Ajvad Haneef K; Madhu Kumar S D

arXiv:2602.09634·cs.LG·February 11, 2026

Open Access

TL;DR

This paper explores using large language models in a zero-shot setting to guide feature selection for malware detection, achieving competitive results with enhanced interpretability and stability compared to traditional methods.

Contribution

The paper introduces a zero-shot, LLM-guided feature selection approach for malware detection that targets both interpretability and detection performance without relying on labeled data.

Findings

01. LLM-guided FS achieves accuracy and other evaluation metrics competitive with traditional FS methods.

02. Zero-shot LLM-guided FS offers improved interpretability and stability.

03. The approach reduces dependence on labeled data for feature selection.

Abstract

Feature selection (FS) remains essential for building accurate and interpretable detection models, particularly in high-dimensional malware datasets. Conventional FS methods such as Extra Trees, Variance Threshold, Tree-based models, Chi-Squared tests, ANOVA, Random Selection, and Sequential Attention rely primarily on statistical heuristics or model-driven importance scores, often overlooking the semantic context of features. Motivated by recent progress in LLM-driven FS, we investigate whether large language models (LLMs) can guide feature selection in a zero-shot setting, using only feature names and task descriptions, as a viable alternative to traditional approaches. We evaluate multiple LLMs (GPT-5.0, GPT-4.0, Gemini-2.5, etc.) on the EMBOD dataset (a fusion of the EMBER and BODMAS benchmark datasets), comparing them against established FS methods across several classifiers, including…
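The zero-shot setting described in the abstract, where the model sees only feature names and a task description, can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the JSON reply format, the helper names (`build_prompt`, `parse_selection`, `select_features`), the example feature names, and the stubbed `fake_llm` are all assumptions made for illustration. Any real LLM client could be passed in place of the stub as a plain `str -> str` callable.

```python
import json
from typing import Callable, List


def build_prompt(task: str, feature_names: List[str], k: int) -> str:
    """Build a zero-shot prompt from a task description and feature names only.

    No labels or feature values are included (hypothetical prompt wording).
    """
    listing = "\n".join(f"- {name}" for name in feature_names)
    return (
        f"Task: {task}\n"
        f"Candidate features:\n{listing}\n"
        f"Select the {k} features most relevant to the task. "
        "Respond with a JSON array of feature names only."
    )


def parse_selection(response: str, feature_names: List[str], k: int) -> List[str]:
    """Keep only valid, de-duplicated feature names from the model's reply."""
    try:
        proposed = json.loads(response)
    except json.JSONDecodeError:
        return []  # unparseable reply: select nothing rather than guess
    if not isinstance(proposed, list):
        return []
    valid = set(feature_names)
    selected: List[str] = []
    for name in proposed:
        if isinstance(name, str) and name in valid and name not in selected:
            selected.append(name)
    return selected[:k]


def select_features(
    llm: Callable[[str], str], task: str, feature_names: List[str], k: int
) -> List[str]:
    """Run one zero-shot selection round with any str -> str LLM callable."""
    prompt = build_prompt(task, feature_names, k)
    return parse_selection(llm(prompt), feature_names, k)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a fixed reply for demonstration."""
    return '["imports_count", "section_entropy", "not_a_feature", "imports_count"]'


# Hypothetical feature names, not taken from the EMBOD dataset.
features = ["file_size", "imports_count", "section_entropy", "timestamp"]
selected = select_features(
    fake_llm, "Classify Windows PE files as malware or benign.", features, k=2
)
print(selected)
```

Filtering the reply against the known feature list guards against hallucinated or repeated names, so the downstream classifier only ever receives columns that exist in the dataset.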

Taxonomy

Topics: Advanced Malware Detection Techniques · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning