Extracting Connected Concepts from Biomedical Texts using Fog Index
Rushdi Shams, Robert E. Mercer

TL;DR
This paper introduces a method using the Fog Index to identify sentences with connected biomedical concepts, enhancing the extraction of relevant concept pairs from scientific texts.
Contribution
We propose a novel approach combining Fog Index filtering and an association matrix to improve connected concept extraction in biomedical literature.
Findings
Sentences that contain both concepts of a pair tend to be harder to read, i.e. they have a higher Fog Index.
The combined filtering method effectively reduces irrelevant concept pairs.
Experimental results show improved accuracy in identifying meaningful concept connections.
Abstract
In this paper, we establish Fog Index (FI) as a text filter to locate the sentences in texts that contain connected biomedical concepts of interest. To do so, we have used 24 random papers, each containing four pairs of connected concepts. For each pair, we categorize sentences based on whether they contain both, either, or neither of the concepts. We then use FI to measure the difficulty of the sentences in each category and find that sentences containing both concepts have low readability. We rank the sentences of a text according to their FI and select the 30 percent most difficult sentences. We use an association matrix to track the most frequent pairs of concepts in them. This matrix reports that the first filter produces some pairs that hold almost no connections. To remove these unwanted pairs, we use the Equally Weighted Harmonic Mean of their Positive Predictive Value (PPV) and…
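The first filter described in the abstract (rank sentences by FI, keep the most difficult 30 percent, then count which concept pairs co-occur in them) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the syllable counter is a crude vowel-group heuristic, and concepts are matched by plain case-insensitive substring search, which the paper does not specify. The score uses the standard Gunning Fog formula, 0.4 × (average sentence length + percentage of words with three or more syllables), applied to a single sentence.

```python
import re
from collections import Counter
from itertools import combinations


def count_syllables(word):
    """Rough syllable estimate: number of vowel runs (assumed heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fog_index(sentence):
    """Gunning Fog Index of one sentence; higher means harder to read."""
    words = re.findall(r"[A-Za-z]+", sentence)
    if not words:
        return 0.0
    # "Complex" words are those with three or more (estimated) syllables.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    return 0.4 * (len(words) + 100.0 * complex_words / len(words))


def hardest_sentences(sentences, fraction=0.3):
    """Return the top `fraction` of sentences ranked by descending FI."""
    ranked = sorted(sentences, key=fog_index, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


def pair_counts(sentences, concepts):
    """Count co-occurring concept pairs (a sparse association matrix).

    Concept matching by substring is an assumption made for this sketch.
    """
    counts = Counter()
    for sentence in sentences:
        text = sentence.lower()
        present = sorted(c for c in concepts if c.lower() in text)
        counts.update(combinations(present, 2))
    return counts


if __name__ == "__main__":
    # Invented example sentences, for illustration only.
    sentences = [
        "The cat sat.",
        "Aspirin inhibits cyclooxygenase activity and reduces "
        "inflammation in experimental animals.",
        "We ran a test.",
        "It was fine.",
    ]
    selected = hardest_sentences(sentences, fraction=0.3)
    print(pair_counts(selected, ["aspirin", "inflammation", "cat"]))
```

The second filter mentioned in the abstract, which removes weakly connected pairs, is not sketched here because its definition is cut off in the text above.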
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Advanced Text Analysis Techniques
