Generating Fake Cyber Threat Intelligence Using Transformer-Based Models

Priyanka Ranade; Aritran Piplai; Sudip Mittal; Anupam Joshi; Tim Finin

arXiv:2102.04351·cs.CR·June 22, 2021

TL;DR

This paper shows that a fine-tuned transformer model (GPT-2) can generate convincing fake cyber threat intelligence (CTI), and that such text can poison cybersecurity knowledge graphs and mislead security professionals.

Contribution

The paper introduces a method for generating realistic fake CTI with transformers and shows that the generated text can poison cyber-defense systems and deceive experts.

Findings

01

Fake CTI can be generated convincingly using GPT-2 with fine-tuning.

02

Generated fake CTI successfully poisons cybersecurity knowledge graphs.

03

Cybersecurity professionals often mistake fake CTI for real information.

Abstract

Cyber-defense systems are being developed to automatically ingest Cyber Threat Intelligence (CTI) that contains semi-structured data and/or text to populate knowledge graphs. A potential risk is that fake CTI can be generated and spread through Open-Source Intelligence (OSINT) communities or on the Web to effect a data poisoning attack on these systems. Adversaries can use fake CTI examples as training input to subvert cyber defense systems, forcing the model to learn incorrect inputs to serve their malicious needs. In this paper, we automatically generate fake CTI text descriptions using transformers. We show that, given an initial prompt sentence, a public language model such as GPT-2, once fine-tuned, can generate plausible CTI text capable of corrupting cyber-defense systems. We utilize the generated fake CTI text to perform a data poisoning attack on a Cybersecurity…
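The poisoning risk described in the abstract can be illustrated with a toy example. The sketch below is not the authors' pipeline: it assumes that entity and relation extraction has already turned CTI text into (subject, relation, object) triples, and the class `ToyKnowledgeGraph`, the malware name, and all triples are hypothetical. It only shows why a system that ingests CTI without verification ends up answering queries with attacker-supplied facts.

```python
from collections import defaultdict


class ToyKnowledgeGraph:
    """Hypothetical minimal triple store: (subject, relation) -> set of objects."""

    def __init__(self):
        self._edges = defaultdict(set)

    def ingest(self, triples):
        # No provenance or credibility check: every triple is trusted equally.
        for subj, rel, obj in triples:
            self._edges[(subj, rel)].add(obj)

    def query(self, subj, rel):
        # Return all known objects for the pair, sorted for stable output.
        return sorted(self._edges.get((subj, rel), set()))


# Hypothetical triples standing in for facts extracted from genuine CTI.
real_triples = [
    ("ExampleMalware", "delivered_via", "phishing email"),
    ("ExampleMalware", "targets", "Windows hosts"),
]

# Hypothetical triple standing in for a fact extracted from fake CTI text.
fake_triples = [
    ("ExampleMalware", "delivered_via", "printer firmware update"),
]

kg = ToyKnowledgeGraph()
kg.ingest(real_triples)
before = kg.query("ExampleMalware", "delivered_via")

kg.ingest(fake_triples)
after = kg.query("ExampleMalware", "delivered_via")

print("before poisoning:", before)
print("after poisoning: ", after)
```

After the second `ingest` call, the query result mixes the genuine delivery vector with the fabricated one, and nothing in the graph distinguishes them. A defensive design would attach source and confidence metadata to each triple, which this sketch deliberately omits.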

Taxonomy

Methods: Linear Layer · Cosine Annealing · Layer Normalization · Residual Connection · Attention Dropout · Discriminative Fine-Tuning · Multi-Head Attention · Adam · Linear Warmup With Cosine Annealing · Weight Decay