RIPEx: Extracting malicious IP addresses from security forums using cross-forum learning
Joobin Gharibshah, Evangelos E. Papalexakis, and Michalis Faloutsos

TL;DR
RIPEx is a cross-forum learning system that automatically extracts and classifies malicious IP addresses from security forums, achieving high precision and recall without needing training data for each new forum.
Contribution
It introduces a novel cross-forum knowledge transfer approach for extracting and labeling malicious IP addresses in security forums, eliminating the need for per-forum training data.
Findings
Achieves over 95% precision and 93% recall in IP address identification.
Identifies malicious IPs with 88% precision and 78% recall.
Works effectively across multiple security forums without additional training.
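For readers less familiar with the two metrics quoted above, the following minimal sketch shows how precision and recall are computed from true-positive, false-positive, and false-negative counts. The function name `precision_recall` and the counts in the usage lines are hypothetical, chosen only for illustration; they are not figures from the paper.

```python
def precision_recall(tp, fp, fn):
    """Compute (precision, recall) from raw counts.

    tp: items correctly flagged, fp: items wrongly flagged,
    fn: items that should have been flagged but were missed.
    """
    # Precision: of everything flagged, what fraction was correct?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything that should be flagged, what fraction was found?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts (NOT from the paper), for illustration only.
p, r = precision_recall(tp=88, fp=12, fn=25)
print(f"precision={p:.2f} recall={r:.2f}")
```

High precision with lower recall, as in the malicious-IP result, means that most flagged addresses are truly malicious while some malicious addresses go unflagged.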
Abstract
Is it possible to extract malicious IP addresses reported in security forums in an automatic way? This is the question at the heart of our work. We focus on security forums, where security professionals and hackers share knowledge and information, and often report misbehaving IP addresses. So far, there have only been a few efforts to extract information from such security forums. We propose RIPEx, a systematic approach to identify and label IP addresses in security forums by utilizing a cross-forum learning method. In more detail, the challenge is twofold: (a) identifying IP addresses from other numerical entities, such as software version numbers, and (b) classifying the IP address as benign or malicious. We propose an integrated solution that tackles both these problems. A novelty of our approach is that it does not require training data for each new forum. Our approach does…
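To illustrate why challenge (a) is non-trivial, here is a minimal sketch of a purely syntactic candidate extractor. It is not the RIPEx method: the pattern `CANDIDATE` and the helper `candidate_ips` are hypothetical names, and the approach (a regular expression plus an octet range check) is an assumed baseline.

```python
import re

# Four dot-separated groups of 1-3 digits that are not part of a longer
# dotted number. A trailing sentence period is still allowed.
CANDIDATE = re.compile(
    r"(?<![\d.])(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})(?!\d|\.\d)"
)


def candidate_ips(text):
    """Return dotted-quad strings whose four parts are all in 0..255."""
    found = []
    for match in CANDIDATE.finditer(text):
        if all(int(part) <= 255 for part in match.groups()):
            found.append(match.group(0))
    return found


post = "Blocked 192.168.1.10 today, then updated firmware to 2.4.1.0."
print(candidate_ips(post))
```

The version string `2.4.1.0` passes every syntactic check, so a pattern-based filter alone cannot separate IP addresses from version numbers. Telling them apart requires the surrounding text of the post, which is what motivates a learned classifier.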
