Towards Effective Sentence Simplification for Automatic Processing of Biomedical Text
Siddhartha Jonnalagadda, Luis Tari, Jorg Hakenberg, Chitta Baral, and Graciela Gonzalez

TL;DR
This paper introduces bioSimplify, a sentence simplification method designed to enhance syntactic parsing accuracy of biomedical texts, thereby improving downstream text mining tasks.
Contribution
The paper presents a biomedical sentence simplification technique, bioSimplify, that improves syntactic parser performance on complex sentences from biomedical abstracts.
Findings
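Parser performance improved by 2.90% for the Charniak-McClosky parser and by 4.23% for the Link Grammar parser when processing simplified rather than original sentences.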
Simplification benefits extend to multiple syntactic parsers.
Enhanced parsing performance facilitates better biomedical text processing.
Abstract
The complexity of sentences characteristic of biomedical articles poses a challenge to natural language parsers, which are typically trained on large-scale corpora of non-technical text. We propose a text simplification process, bioSimplify, that seeks to reduce the complexity of sentences in biomedical abstracts in order to improve the performance of syntactic parsers on the processed sentences. Syntactic parsing is typically one of the first steps in a text mining pipeline. Thus, any improvement in performance would have a ripple effect over all processing steps. We evaluated our method using a corpus of biomedical sentences annotated with syntactic links. Our empirical results show an improvement of 2.90% for the Charniak-McClosky parser and of 4.23% for the Link Grammar parser when processing simplified sentences rather than the original sentences in the corpus.
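The abstract reports evaluation against a corpus annotated with syntactic links but does not spell out the scoring here. As a rough illustration only, the sketch below scores a parser's predicted links against gold links with an F1 measure. The metric, the `link_f1` helper, and the toy sentence data are all assumptions made for illustration, not the authors' evaluation code or data.

```python
from typing import Iterable, Set, Tuple

Link = Tuple[str, str]  # a syntactic link between two words


def normalize(links: Iterable[Link]) -> Set[Link]:
    """Treat links as undirected by sorting each word pair."""
    return {tuple(sorted(pair)) for pair in links}


def link_f1(gold: Iterable[Link], predicted: Iterable[Link]) -> float:
    """F1 overlap between gold and predicted links (assumed metric)."""
    gold_set, pred_set = normalize(gold), normalize(predicted)
    correct = len(gold_set & pred_set)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_set)
    recall = correct / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: gold links for one sentence, plus the links a
# parser might produce on the original and on the simplified sentence.
gold = [("p53", "activates"), ("activates", "expression"), ("MDM2", "expression")]
parse_original = [("p53", "activates"), ("activates", "MDM2"), ("MDM2", "expression")]
parse_simplified = [("activates", "p53"), ("activates", "expression"), ("MDM2", "expression")]

gain = link_f1(gold, parse_simplified) - link_f1(gold, parse_original)
print(f"F1 gain from simplification on the toy example: {gain:.3f}")
```

Links are compared as unordered pairs so that a parser emitting the same link in the opposite direction is still counted as correct; a directed comparison would be a different, stricter design choice.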
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling
