Feature engineering vs. deep learning for paper section identification: Toward applications in Chinese medical literature

Sijia Zhou; Xin Li

arXiv:2412.11125·cs.CL·December 17, 2024

TL;DR

This paper compares traditional machine learning and deep learning approaches for identifying sections in Chinese medical literature, and proposes a novel SLSTM model that outperforms existing methods with nearly 90% precision and recall.

Contribution

It introduces the Structural Bidirectional LSTM (SLSTM) model for Chinese literature section identification, demonstrating its superiority over traditional and other deep learning methods.

Findings

01

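Conditional Random Fields (CRFs), which consider sentence interdependency, are more effective at combining feature sets such as bag-of-words, part-of-speech, and headings.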

02

Generic deep learning models are less effective than classic ML algorithms for this task.

03

The SLSTM model achieves nearly 90% precision and recall.

Abstract

Section identification is an important task for library science, especially knowledge management. Identifying the sections of a paper would help filter noise in entity and relation extraction. In this research, we studied the paper section identification problem in the context of Chinese medical literature analysis, where the subjects, methods, and results are more valuable from a physician's perspective. Based on previous studies on English literature section identification, we experiment with the effective features to use with classic machine learning algorithms to tackle the problem. It is found that Conditional Random Fields, which consider sentence interdependency, are more effective in combining different feature sets, such as bag-of-words, part-of-speech, and headings, for Chinese literature section identification. Moreover, we find that classic machine learning algorithms are…
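
To illustrate why modeling sentence interdependency can matter for section identification, the following is a minimal sketch of Viterbi decoding over a linear chain of sentence labels, the inference step a CRF-style model uses. It is not the authors' implementation. The label set, the emission scores, and the transition scores are all hypothetical values chosen by hand for illustration; a trained CRF would learn such scores from features like bag-of-words, part-of-speech, and headings.

```python
from typing import Dict, List, Tuple

# Hypothetical label set, loosely based on the sections named in the abstract.
LABELS = ["subjects", "methods", "results"]


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float],
            labels: List[str]) -> List[str]:
    """Return the highest-scoring label sequence for a list of sentences."""
    if not emissions:
        return []
    # best[i][y]: best score of any label path that ends in y at sentence i.
    best = [{y: emissions[0].get(y, 0.0) for y in labels}]
    back = []  # back[i - 1][y]: best previous label when sentence i has label y
    for i in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels,
                       key=lambda p: best[-1][p] + transitions.get((p, y), 0.0))
            scores[y] = (best[-1][prev] + transitions.get((prev, y), 0.0)
                         + emissions[i].get(y, 0.0))
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hand-set per-sentence scores (assumed, not from the paper).
# The second sentence is ambiguous when looked at in isolation.
emissions = [
    {"subjects": 2.0, "methods": 0.1, "results": 0.0},
    {"subjects": 0.2, "methods": 0.5, "results": 0.6},
    {"subjects": 0.0, "methods": 2.0, "results": 0.3},
]

# Hand-set transition scores (assumed): staying in a section is rewarded,
# skipping ahead is mildly penalized, moving backwards is strongly penalized.
transitions = {
    ("subjects", "subjects"): 0.5, ("methods", "methods"): 0.5,
    ("results", "results"): 0.5,
    ("subjects", "methods"): 0.0, ("methods", "results"): 0.0,
    ("subjects", "results"): -1.0,
    ("methods", "subjects"): -2.0, ("results", "methods"): -2.0,
    ("results", "subjects"): -2.0,
}

# Labeling each sentence independently ignores its neighbours.
independent = [max(LABELS, key=lambda y: e[y]) for e in emissions]
# Joint decoding takes the neighbouring labels into account.
decoded = viterbi(emissions, transitions, LABELS)

print(independent)
print(decoded)
```

In this toy setup, the independent labeling assigns the ambiguous second sentence to "results" and then jumps back to "methods", while the joint decoding keeps the second sentence in "methods" because a backward transition is penalized.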
