Impact of Feature Encoding on Malware Classification Explainability
Elyes Manai, Mohamed Mejri, Jaouhar Fattahi

TL;DR
This study examines how feature encoding methods, specifically Label Encoding and One Hot Encoding, affect the explainability and analysis efficiency of malware classification models using XAI techniques.
Contribution
The paper demonstrates that One Hot Encoding enhances explainability and reduces analysis time, at the cost of a slight performance loss compared to Label Encoding.
Findings
OHE yields more detailed explanations from XAI methods than LE.
OHE results in smaller explanation files and shorter analysis time for human analysts.
The performance loss from using OHE instead of LE is marginal.
Abstract
This paper investigates the impact of feature encoding techniques on the explainability of XAI (Explainable Artificial Intelligence) algorithms. Using a malware classification dataset, we trained an XGBoost model and compared the performance of two feature encoding methods: Label Encoding (LE) and One Hot Encoding (OHE). Our findings reveal a marginal performance loss when using OHE instead of LE. However, the more detailed explanations provided by OHE compensated for this loss. We observed that OHE enables deeper exploration of details in both global and local contexts, facilitating more comprehensive answers. Additionally, we observed that using OHE resulted in smaller explanation files and reduced analysis time for human analysts. These findings emphasize the significance of considering feature encoding techniques in XAI research and suggest potential for further exploration by…
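To make the two encodings concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: the feature name `packer`, its values, and the helper functions `label_encode` and `one_hot_encode` are hypothetical and chosen only for illustration. It shows how LE keeps a categorical feature as one integer column, while OHE expands it into one indicator column per category.

```python
# Toy illustration only: the feature name and its values are hypothetical,
# not taken from the paper's dataset.
rows = [
    {"packer": "upx"},
    {"packer": "none"},
    {"packer": "aspack"},
    {"packer": "upx"},
]


def label_encode(rows, column):
    """Map each category to an integer code (alphabetical order)."""
    categories = sorted({row[column] for row in rows})
    code = {cat: i for i, cat in enumerate(categories)}
    return [{column: code[row[column]]} for row in rows], code


def one_hot_encode(rows, column):
    """Expand the column into one 0/1 indicator column per category."""
    categories = sorted({row[column] for row in rows})
    return [
        {f"{column}={cat}": int(row[column] == cat) for cat in categories}
        for row in rows
    ]


le_rows, le_codes = label_encode(rows, "packer")
ohe_rows = one_hot_encode(rows, "packer")

print(le_codes)     # category -> integer mapping used by LE
print(le_rows[0])   # first sample under LE: a single integer column
print(ohe_rows[0])  # first sample under OHE: one indicator per category
```

Under LE, a feature-attribution method can only assign a single score to `packer` as a whole. Under OHE, each indicator column receives its own score, so an explanation can point to a specific category. This is one plausible reading of why the abstract reports more detailed explanations with OHE.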
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques · Explainable Artificial Intelligence (XAI)
