Vendor-Conditioned Contrastive Learning for Predicting Organizational Cyber Threat Targets
Benjamin M. Ampel

TL;DR
This paper introduces TRACE, a contrastive learning framework that predicts which organizations cyber exploits target. It draws on large-scale, multi-source data and outperforms benchmark models, especially under temporal distribution shift.
Contribution
The paper presents a novel vendor-conditioned contrastive learning approach leveraging extensive multi-source data for improved cyber threat target prediction.
Findings
TRACE achieves a macro F1 of 97.00% in temporal out-of-distribution evaluation.
It outperforms 17 benchmark models spanning classical ML and deep learning.
The corpus comprises 352,866 posts collected over three decades from 9 exploit databases and hacker forums, yielding a 129,126-sample dataset.
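The macro F1 metric and the temporal out-of-distribution setting named in these findings can be illustrated with a minimal sketch. This is not the paper's evaluation code: the `year` field, the cutoff year, and the function names are hypothetical, chosen only to show how a time-based split and an unweighted per-class F1 average fit together.

```python
from collections import defaultdict


def temporal_split(records, cutoff_year):
    """Train on posts before the cutoff, test on posts from the cutoff onward.

    `records` is a list of dicts with a hypothetical "year" key.
    """
    train = [r for r in records if r["year"] < cutoff_year]
    test = [r for r in records if r["year"] >= cutoff_year]
    return train, test


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because macro F1 weights every class equally, a rare organizational category counts as much as a common one, which makes the metric sensitive to errors on minority classes.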
Abstract
Cyberattacks cause billions of dollars in damage annually, with malicious hackers often sharing exploit code and techniques on underground forums. Identifying which organizations are targeted by these exploits is critical for proactive Cyber Threat Intelligence (CTI). To support this need, we propose Temporal Representation and Classification of Exploits (TRACE), a vendor-conditioned contrastive learning framework built on CySecBERT that jointly optimizes organizational target classification and vendor-coherent representations while evaluating robustness under temporal distribution shift. Unlike prior work limited to small, single-source datasets, we leverage a large-scale, multi-source corpus spanning 9 exploit databases and hacker forums, comprising 352,866 posts collected over three decades, yielding a 129,126-sample dataset across seven organizational categories. In the temporal…
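The idea of jointly optimizing target classification and vendor-coherent representations can be sketched as a classification loss plus a contrastive term in which posts sharing a vendor act as positives. The abstract does not give the exact objective, so the following is an assumption: a generic supervised-contrastive formulation in plain Python, with a hypothetical mixing weight `weight` and temperature `temperature`, not the authors' implementation.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cross_entropy(logits, targets):
    """Mean softmax cross-entropy; `targets` holds class indices."""
    total = 0.0
    for row, target in zip(logits, targets):
        peak = max(row)
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[target]
    return total / len(targets)


def vendor_contrastive_loss(embeddings, vendors, temperature=0.1):
    """Supervised-contrastive-style loss: same-vendor posts are positives."""
    z = [_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and vendors[j] == vendors[i]]
        if not positives:
            continue  # an anchor with no same-vendor partner contributes nothing
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def joint_loss(logits, targets, embeddings, vendors, weight=0.5, temperature=0.1):
    """Classification loss plus a weighted vendor-contrastive term (assumed form)."""
    contrastive = vendor_contrastive_loss(embeddings, vendors, temperature)
    return cross_entropy(logits, targets) + weight * contrastive
```

In this sketch the contrastive term is small when posts about the same vendor sit close together in embedding space and large when they are scattered, so minimizing the sum pushes the encoder toward both accurate labels and vendor-coherent clusters.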
