# EDAER: Entropy-Driven Approach for Entity and Relation Extraction in Chinese Cyber Threat Intelligence

**Authors:** Yong Li, Xiuping Li, Yangbai Zhang, Zhiqiang Liu, Xiaowei Li, Qi Xu, Xiaolin Chang

PMC · DOI: 10.3390/e28030261 · Entropy · 2026-02-27

## TL;DR

This paper introduces a new method called EDAER to improve entity and relation extraction in Chinese cyber threat intelligence, especially in low-resource scenarios.

## Contribution

The paper introduces a new Chinese CTI dataset and an entropy-driven approach combining multiple models for improved NER and RE performance.

## Key findings

- RoBERTa_wwm outperforms BERT in both NER and RE tasks.
- Mamba performs better than BiLSTM in NER tasks.
- Entropy-based mechanisms and contrastive learning improve model performance in low-resource scenarios.
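
The entropy-based mechanism in the last finding depends on measuring how uncertain a prediction is. As a minimal sketch, the snippet below computes the normalized Shannon entropy of a label distribution and uses it to blend two feature vectors. The function names and the blending rule are hypothetical illustrations, since this summary does not give the paper's actual gating formula.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy of a distribution, divided by log(K) so it lies in [0, 1]."""
    k = len(probs)
    if k < 2:
        return 0.0  # a single-class distribution carries no uncertainty
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)


def entropy_gate(features_a, features_b, probs):
    """Hypothetical gate (an assumption, not the paper's formula):
    the more uncertain the prediction, the more weight features_b receives."""
    g = normalized_entropy(probs)
    return [(1.0 - g) * a + g * b for a, b in zip(features_a, features_b)]


# Example: a confident and an uncertain prediction over four entity labels.
confident = softmax([8.0, 0.0, 0.0, 0.0])
uncertain = softmax([0.0, 0.0, 0.0, 0.0])
print(normalized_entropy(confident), normalized_entropy(uncertain))
```

Normalizing by log(K) keeps the score comparable across label sets of different sizes, such as the entity and relation label sets in this paper.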

## Abstract

Cyber threat intelligence (CTI) has been explored to strengthen system security by taking raw threat data from various data sources and transforming it into actionable insights that enable organizations to predict, detect, and respond to cyber threats. Named entity recognition (NER) and relation extraction (RE) are the key tasks of CTI data mining. However, current CTI NER and/or RE research is mainly focused on English CTI, which is not directly transferable to Chinese CTI due to fundamental linguistic and terminological differences. Moreover, the limited existing studies on Chinese CTI do not effectively address uncertainty in predictions in low-resource scenarios where entities and relations are sparse. This work aims to improve the performance of NER and RE tasks in low-resource Chinese CTI scenarios, and we make two major contributions. The first is that we construct a Chinese CTI dataset, which includes 16 types of entities and 9 types of relations, more than those of the existing open-source dataset on Chinese CTI. The second is that we propose an entropy-driven approach for entity and relation extraction (EDAER). EDAER is the first to combine the techniques of RoBERTa_wwm, Mamba, RDCNN and CRF to perform NER tasks. In addition, EDAER is the first to apply entropy to quantify the uncertainty of the model’s predictions in NER and RE tasks in Chinese CTI scenarios. Moreover, EDAER is the first to apply contrastive learning techniques in Chinese CTI scenarios to learn meaningful features by maximizing the similarity between positive samples and minimizing the similarity between negative samples. Extensive experimental results on public datasets and our own dataset demonstrate that our proposed approach performs best. These results show that (1) RoBERTa_wwm significantly outperforms BERT on both NER and RE tasks; (2) Mamba outperforms BiLSTM on the NER task; (3) the entropy-based dynamic gating mechanism contributes to performance improvements in both NER and RE tasks; and (4) the uncertainty-guided contrastive learning mechanism is helpful for performance improvement in the NER task.
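
The contrastive objective described in the abstract pulls an anchor representation toward its positive sample and pushes it away from negative samples. The sketch below uses the widely used InfoNCE loss with cosine similarity to illustrate that idea. The choice of InfoNCE, the temperature value, and all names are assumptions, since this summary does not state which loss EDAER uses or how uncertainty guides it.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: low when the anchor is close to the
    positive and far from every negative. Illustrative only (assumed loss)."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Example with toy 2-d embeddings (hypothetical values).
anchor = [1.0, 0.0]
well_separated = info_nce(anchor, [1.0, 0.0], [[0.0, 1.0]])
poorly_separated = info_nce(anchor, [0.0, 1.0], [[1.0, 0.0]])
print(well_separated, poorly_separated)
```

A lower temperature sharpens the penalty on negatives that are close to the anchor, which is one reason it is usually tuned per dataset.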

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13024763/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13024763/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC13024763/full.md

---
Source: https://tomesphere.com/paper/PMC13024763