# TCMSF: A Construction Framework of Traditional Chinese Medicine Syndrome Ancient Book Knowledge Graph

**Authors:** Ziling Zeng, Lin Tong, Bing Li, Wenjing Zong, Qikai Niu, Sihong Liu, Lei Zhang, Jialun Wang, Siqi Zhang, Siwei Tian, Jing'ai Wang, Wei Zhang, Huamin Zhang

PMC · DOI: 10.1055/a-2590-6348 · Methods of Information in Medicine · 2025-05-15

## TL;DR

This paper introduces a framework for building a structured knowledge graph from ancient Chinese medicine texts to better organize syndrome knowledge.

## Contribution

TCMSF combines fine-tuned pretrained language models with rule-based methods to improve the accuracy of knowledge graph construction from ancient TCM syndrome texts.

## Key findings

- Using ERNIE combined with Conditional Random Fields, TCMSF achieved an average F1 score of 0.77 for entity extraction in the Yin deficiency syndrome case study.
- The proposed relationship extraction method produced fewer incorrectly connected relationships than a fully connected pattern layer.
- The resulting knowledge graph of ancient books on Yin deficiency syndrome contains over 120,000 entities and over 1.18 million relationships.
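
To make the F1 figure above concrete, here is a minimal sketch of entity-level F1 under exact-match scoring, averaged across entity types. It is an assumption that the paper's "average" is a per-type (macro) average; the entity type names and the toy spans are hypothetical and not taken from the paper.

```python
from typing import Dict, Set, Tuple

# (start offset, end offset, entity type); type names are hypothetical.
Entity = Tuple[int, int, str]


def f1_by_type(gold: Set[Entity], pred: Set[Entity]) -> Dict[str, float]:
    """Exact-match F1 for each entity type found in gold or pred."""
    scores: Dict[str, float] = {}
    for etype in sorted({e[2] for e in gold | pred}):
        g = {e for e in gold if e[2] == etype}
        p = {e for e in pred if e[2] == etype}
        tp = len(g & p)  # predictions matching a gold span and type exactly
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        denom = precision + recall
        scores[etype] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(scores: Dict[str, float]) -> float:
    """Unweighted mean of the per-type F1 scores."""
    return sum(scores.values()) / len(scores) if scores else 0.0


# Toy data: one SYMPTOM span is missed and one spurious span is predicted.
gold = {(0, 2, "SYNDROME"), (3, 5, "SYMPTOM"), (6, 8, "SYMPTOM")}
pred = {(0, 2, "SYNDROME"), (3, 5, "SYMPTOM"), (9, 11, "SYMPTOM")}

per_type = f1_by_type(gold, pred)
average = macro_f1(per_type)
print(per_type, average)
```

A micro average (pooling all types before computing precision and recall) would weight frequent entity types more heavily; the summary does not say which variant the authors report.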

## Abstract

Syndrome is a unique and crucial concept in traditional Chinese medicine (TCM). However, much of the syndrome knowledge lacks systematic organization and correlation, and current information technologies are unsuitable for TCM ancient texts.

We aimed to develop a knowledge graph that presents this knowledge in a more orderly, structured, and semantically oriented manner, providing a foundation for computer-aided diagnosis and treatment.

We developed TCMSF, a framework that uses a pretrained model and rules to construct TCM syndrome knowledge from ancient books. We fine-tuned the Enhanced Representation through Knowledge Integration (ERNIE) and Bidirectional Encoder Representations from Transformers (BERT) pretrained language models, as well as the ChatGLM3-6B large language model, for named entity recognition (NER). We then applied a progressive entity relationship extraction method, based on a dual pattern feature combination, to extract and standardize the entities in these books and the relationships between them.
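
As an illustration of the NER step, a sequence tagger such as a pretrained encoder with a CRF layer typically emits one label per character, which is then collapsed into entity spans. The sketch below shows only that decoding step. It assumes a BIO tagging scheme and hypothetical label names (`SYNDROME`, `SYMPTOM`); the summary does not state which scheme or label set the authors used, and the tags here are hand-written rather than model output.

```python
from typing import List, Optional, Tuple


def bio_to_spans(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse per-character BIO tags into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Emit the entity being built, if any.
        if current and current_type is not None:
            spans.append(("".join(current), current_type))

    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            flush()
            current, current_type = [ch], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current.append(ch)
        else:
            # "O" or an inconsistent "I-" tag ends the current entity.
            flush()
            current, current_type = [], None
    flush()
    return spans


# Illustrative classical phrase; the tags are assumed, not produced by a model.
chars = list("阴虚则内热")
tags = ["B-SYNDROME", "I-SYNDROME", "O", "B-SYMPTOM", "I-SYMPTOM"]
entities = bio_to_spans(chars, tags)
print(entities)
```

Character-level tagging is a common choice for classical Chinese because the text has no whitespace word boundaries, which avoids depending on a word segmenter.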

We selected Yin deficiency syndrome as a case study and constructed a model layer suited to expressing the knowledge in these books. Among the NER methods compared, the combination of ERNIE and Conditional Random Fields performed best. Using this combination, we completed entity extraction for Yin deficiency syndrome with an average F1 score of 0.77. Our relationship extraction method produced fewer incorrectly connected relationships than fully connected pattern layers. We constructed a knowledge graph of ancient books on Yin deficiency syndrome containing over 120,000 entities and over 1.18 million relationships.
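
The contrast with a fully connected pattern layer can be sketched as follows. This is not the authors' dual pattern feature combination method, whose details are not given in this summary. It only illustrates why restricting candidate links to permitted entity-type pairs yields fewer, and less often spurious, relationships than linking every entity to every other one. The schema, relation names, and entities are hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Tuple

Entity = Tuple[str, str]        # (text, type)
Triple = Tuple[str, str, str]   # (head text, relation, tail text)

# Hypothetical pattern layer: which (head type, tail type) pairs may be linked.
ALLOWED: Dict[Tuple[str, str], str] = {
    ("SYNDROME", "SYMPTOM"): "has_symptom",
    ("SYNDROME", "FORMULA"): "treated_by",
}


def fully_connected(entities: List[Entity]) -> List[Tuple[Entity, Entity]]:
    """Link every ordered pair of distinct entities found in one passage."""
    return list(permutations(entities, 2))


def schema_constrained(entities: List[Entity]) -> List[Triple]:
    """Keep only the ordered pairs whose type combination the schema permits."""
    return [
        (head[0], ALLOWED[(head[1], tail[1])], tail[0])
        for head, tail in permutations(entities, 2)
        if (head[1], tail[1]) in ALLOWED
    ]


# Illustrative entities assumed to co-occur in one passage.
entities = [("阴虚", "SYNDROME"), ("内热", "SYMPTOM"), ("六味地黄丸", "FORMULA")]

all_pairs = fully_connected(entities)
triples = schema_constrained(entities)
print(len(all_pairs), triples)
```

With three co-occurring entities, the fully connected variant proposes six directed links, while the constrained variant keeps two. The gap widens quickly as passages mention more entities.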

We developed TCMSF in line with the knowledge characteristics of ancient TCM books and improved the accuracy of knowledge graph construction.

## Full-text entities

- **Diseases:** TCM syndrome (MESH:C562377), Yin deficiency syndrome (MESH:D016710)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12196822/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12196822/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/PMC12196822/full.md

---
Source: https://tomesphere.com/paper/PMC12196822