AVMiner: Expansible and Semantic-Preserving Anti-Virus Labels Mining Method

Ligeng Chen; Zhongling He; Hao Wu; Yuhang Gong; Bing Mao

arXiv:2208.14221·cs.CR·August 31, 2022

TL;DR

AVMiner is an expandable, semantic-preserving system that automatically extracts and ranks vital malware-related tokens from AV labels using NLP and clustering, enhancing malware diagnosis without expert knowledge.

Contribution

The paper introduces AVMiner, a system that automatically mines and ranks important tokens from AV labels, updates itself with new samples, and does not rely on expert knowledge.

Findings

01. Outperforms previous methods on large datasets

02. Successfully extracts vital malware tokens

03. Self-updates with new samples

Abstract

With the increase in the variety and quantity of malware, there is an urgent need to speed up malware diagnosis and analysis. Extracting malware family-related tokens from AV (Anti-Virus) labels, provided by online anti-virus engines, paves the way for pre-diagnosing malware. Automatically extracting the vital information from AV labels will greatly enhance the detection ability of security enterprises and strengthen the research ability of security analysts. Recent works such as AVCLASS and AVCLASS2 try to extract the attributes of malware from AV labels and establish a taxonomy based on expert knowledge. However, given the uncertain trend of complicated malicious behaviors, a system needs the following abilities to meet the challenge: preserving vital semantics, being expansible, and being free from expert knowledge. In this work, we present AVMiner, an expansible malware…
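
To make the idea of extracting family-related tokens from AV labels concrete, here is a minimal Python sketch. It is not AVMiner's pipeline (the summary above describes NLP and clustering, which this does not reproduce); it only shows plain cross-engine frequency counting. The engine names, the example labels, and the `min_len` cutoff are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter


def tokenize(label: str) -> list[str]:
    """Split an AV label on non-alphanumeric delimiters and lowercase the parts."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", label) if t]


def rank_tokens(labels_by_engine: dict[str, str], min_len: int = 4) -> list[tuple[str, int]]:
    """Rank tokens by how many engines mention them (ties broken alphabetically).

    min_len is an assumed heuristic that drops short fragments such as
    'gen' or 'tr'; purely numeric tokens are dropped as well.
    """
    counts: Counter[str] = Counter()
    for label in labels_by_engine.values():
        # set(): count each token at most once per engine
        for tok in set(tokenize(label)):
            if len(tok) >= min_len and not tok.isdigit():
                counts[tok] += 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical labels for one sample, as four different engines might report it.
sample = {
    "EngineA": "Trojan.Win32.Zbot.abcd",
    "EngineB": "Win32/Zbot.gen!A",
    "EngineC": "TR/Spy.Zbot.12345",
    "EngineD": "Trojan-Spy.Win32.Zbot",
}

if __name__ == "__main__":
    for token, votes in rank_tokens(sample):
        print(f"{token}\t{votes}")
```

In this toy input the token shared by every engine ranks first, while platform and type tokens such as `win32` and `trojan` also score highly. Separating those from true family names is the part that simple counting cannot do without a hand-built list of generic terms, which is the kind of expert knowledge the abstract says the system should avoid.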

Taxonomy

Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Spam and Phishing Detection