# Chinese Named Entity Recognition for Dairy Cow Diseases by Fusion of Multi-Semantic Features Using Self-Attention-Based Deep Learning

**Authors:** Yongjun Lou, Meng Gao, Shuo Zhang, Hongjun Yang, Sicong Wang, Yongqiang He, Jing Yang, Wenxia Yang, Haitao Du, Weizheng Shen

PMC · DOI: 10.3390/ani15060822 · 2025-03-13

## TL;DR

This paper introduces a deep learning model for recognizing disease-related entities, such as disease names and symptoms, in Chinese dairy cow texts. It reaches an F1 score of 92.18% and is intended to support knowledge graph construction for the cattle industry.

## Contribution

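A self-attention-based deep learning model for Chinese NER in dairy cow disease texts that fuses character-, pinyin-, glyph-, and lexical-level features in a Bi-LSTM encoder with multi-head self-attention and decodes the label sequence with a CRF.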

## Key findings

- The proposed model achieved an F1 score of 92.18% on the dairy cow disease corpus.
- Multi-level features (character, pinyin, glyph, lexical) improved entity recognition performance.
- The model outperformed existing baselines for Chinese dairy cow disease named entity recognition.
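
For readers unfamiliar with how an NER F1 score is usually computed, the following is a minimal sketch. It assumes exact-match, span-level scoring over BIO tags, which this summary does not confirm as the paper's protocol. The tag names `DIS` and `SYM`, the example sentence, and both function names are hypothetical.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence.

    `end` is exclusive. An I- tag with no matching open span is ignored.
    """
    entities = set()
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def entity_f1(gold_tags, pred_tags):
    """Precision, recall, and F1 where a prediction counts only on exact span match."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical character-level example: "奶牛乳房炎导致乳房肿胀"
# gold: disease "乳房炎" (chars 2-4), symptom "乳房肿胀" (chars 7-10)
gold = ["O", "O", "B-DIS", "I-DIS", "I-DIS", "O", "O",
        "B-SYM", "I-SYM", "I-SYM", "I-SYM"]
# prediction truncates the symptom span, so it does not count as a match
pred = ["O", "O", "B-DIS", "I-DIS", "I-DIS", "O", "O",
        "B-SYM", "I-SYM", "O", "O"]

precision, recall, f1 = entity_f1(gold, pred)
print(precision, recall, f1)
```

Under exact-match scoring a partially correct boundary earns no credit, which is one reason boundary-sensitive features such as lexical information can matter for Chinese NER.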

## Abstract

Building a high-quality knowledge graph of dairy cow diseases is one of the main concerns in the cattle breeding industry; it can serve as a reliable foundation for subsequent applications, including answering disease-related questions and auxiliary diagnosis systems, which can significantly lower the barrier for farmers and dairy farms to access professional knowledge. The named entity recognition (NER) task is crucial for constructing a knowledge graph and aims to extract key information such as disease names and symptoms from textual data, where the disease name and symptom information are referred to as entities. According to the characteristics of Chinese dairy cow disease texts, this study explored a named entity recognition method based on multi-semantic features. The results show that the proposed model achieved good recognition performance. Our work provides a foundation for the effective utilization of dairy cow disease knowledge in practical applications and a new insight for named entity recognition for other animal or crop diseases.

Named entity recognition (NER) is the basic task of constructing a high-quality knowledge graph, which can provide reliable knowledge in the auxiliary diagnosis of dairy cow disease, thus alleviating problems of missed diagnosis and misdiagnosis due to the lack of professional veterinarians in China. Targeting the characteristics of the Chinese dairy cow diseases corpus, we propose an ensemble Chinese NER model incorporating character-level, pinyin-level, glyph-level, and lexical-level features of Chinese characters. These multi-level features were concatenated and fed into the bidirectional long short-term memory (Bi-LSTM) network based on the multi-head self-attention mechanism to learn long-distance dependencies while focusing on important features. Finally, the globally optimal label sequence was obtained by the conditional random field (CRF) model. Experimental results showed that our proposed model outperformed baselines and related works with an F1 score of 92.18%, which is suitable and effective for named entity recognition for the dairy cow disease corpus.
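
The final decoding step described above, where a CRF selects the globally optimal label sequence, is typically done with the Viterbi algorithm. The sketch below is a plain-Python illustration under stated assumptions, not the authors' implementation: the label set, the emission scores (standing in for the encoder's per-character outputs), and the hand-set transition scores are all hypothetical, and start/end transitions are omitted. In a trained CRF the transition scores are learned.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring path of label indices.

    emissions[t][j]:   score of label j at position t (e.g. from the encoder).
    transitions[i][j]: score of moving from label i to label j.
    """
    n_labels = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_labels):
            # best previous label for arriving at label j
            best_i = max(range(n_labels),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # trace back from the best final label
    path = [max(range(n_labels), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


labels = ["O", "B-DIS", "I-DIS"]  # hypothetical label set

# Hypothetical emission scores for three characters.
emissions = [
    [1.0, 0.8, 0.1],
    [0.2, 0.1, 1.0],
    [0.1, 0.1, 1.0],
]

# Hand-set transitions: only O -> I-DIS is penalized, since an entity
# cannot continue without having begun.
transitions = [
    [0.0, 0.0, -100.0],  # from O
    [0.0, 0.0, 0.0],     # from B-DIS
    [0.0, 0.0, 0.0],     # from I-DIS
]

greedy = [labels[max(range(len(labels)), key=lambda j: row[j])]
          for row in emissions]
decoded = [labels[i] for i in viterbi_decode(emissions, transitions)]
print(greedy)   # per-character argmax, ignores transitions
print(decoded)  # sequence-level optimum
```

In this toy case the per-character argmax yields an invalid sequence that starts an entity with `I-DIS`, whereas sequence-level decoding trades a slightly lower first-character score for a valid `B-DIS` start. This is the general motivation for placing a CRF on top of a Bi-LSTM encoder.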

## Full-text entities

- **Diseases:** Dairy Cow Diseases (MESH:D007787)

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11939194/full.md

---
Source: https://tomesphere.com/paper/PMC11939194