TIJERE: A Novel Threat Intelligence Joint Extraction Model Based on Analyst Expert Knowledge
Inoussa Mouiche, Sherif Saad

TL;DR
TIJERE is a new joint extraction framework that uses expert knowledge and a specialized language model to improve cybersecurity entity and relation extraction accuracy.
Contribution
The paper introduces TIJERE, a multisequence labeling approach leveraging expert features and SecureBERT+ for enhanced cybersecurity information extraction.
Findings
Achieves F1-scores over 0.93 for named entity recognition (NER) and 0.98 for relation extraction (RE).
Introduces DNRTI-JE, the first publicly available cybersecurity joint labeling dataset.
Outperforms existing extraction methods on the DNRTI-JE dataset.
Abstract
The extraction of entities and relationships from threat intelligence reports into structured formats, such as cybersecurity knowledge graphs, is essential for automated threat analysis, detection, and mitigation. However, existing joint extraction methods struggle with feature confusion, language ambiguity, noise propagation, and overlapping relations, resulting in low accuracy and poor model performance. This paper presents TIJERE, an innovative joint entity and relation extraction framework that formulates joint extraction as a multisequence labeling representation (MSLR) problem. Specifically, separate sequences are generated for each entity pair. Unlike prior tagging schemes, MSLR integrates expert domain features to enrich positional, contextual, and semantic representations of entities, thereby enhancing feature distinction and classification accuracy. Additionally, TIJERE…
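The idea of generating a separate label sequence for each entity pair can be illustrated with a minimal sketch. It is not the authors' implementation: the tag format (`B-H-…`, `I-T-…`), the `Entity` type, the `build_pair_sequences` helper, the example sentence, and the relation names are all assumptions made for illustration, and the expert features and SecureBERT+ encoder mentioned in the summary are left out entirely.

```python
from itertools import permutations
from typing import NamedTuple


class Entity(NamedTuple):
    start: int  # index of the first token (inclusive)
    end: int    # index one past the last token (exclusive)
    label: str  # entity type; the values used below are illustrative


def tag_span(tags, entity, role):
    """Write BIO tags for one entity, marked with its role (H = head, T = tail)."""
    for i in range(entity.start, entity.end):
        prefix = "B" if i == entity.start else "I"
        tags[i] = f"{prefix}-{role}-{entity.label}"


def build_pair_sequences(tokens, entities, relations):
    """Return one (head, tail, tags, relation) record per ordered entity pair.

    `relations` maps (head_index, tail_index) to a relation name. Pairs that
    are absent from the mapping get the placeholder relation "NONE".
    This tagging scheme is hypothetical, not the one defined in the paper.
    """
    records = []
    for h, t in permutations(range(len(entities)), 2):
        tags = ["O"] * len(tokens)
        tag_span(tags, entities[h], "H")
        tag_span(tags, entities[t], "T")
        records.append((h, t, tags, relations.get((h, t), "NONE")))
    return records


# Made-up example sentence with three entities and two relations.
tokens = ["APT28", "used", "X-Agent", "against", "government", "agencies"]
entities = [
    Entity(0, 1, "HackOrg"),
    Entity(2, 3, "Tool"),
    Entity(4, 6, "Org"),
]
relations = {(0, 1): "uses", (0, 2): "targets"}

sequences = build_pair_sequences(tokens, entities, relations)
for h, t, tags, rel in sequences:
    print(h, t, rel, tags)
```

Because every ordered pair gets its own sequence, an entity that takes part in several relations (here the first entity is the head of both `uses` and `targets`) never has to carry more than one tag per token, which is one plausible way such a formulation sidesteps overlapping relations. The cost is that the number of sequences grows quadratically with the number of entities in a sentence.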
