Information Extraction of Clinical Trial Eligibility Criteria
Yitong Tseo, M. I. Salkola, Ahmed Mohamed, Anuj Kumar, Freddy Abnousi

TL;DR
This paper presents an information extraction system for clinical trial eligibility criteria. It combines machine learning with grammar-based methods to convert complex free-text criteria into structured, computer-interpretable data, with the aim of facilitating eligibility determination.
Contribution
To the authors' knowledge, this is the first criteria extraction system to apply an attention-based conditional random field (CRF) architecture for named entity recognition (NER) and word2vec embedding clustering for named entity linking (NEL).
Findings
The system achieves performance competitive with Criteria2Query.
To the authors' knowledge, first use of an attention-based CRF for NER in this domain.
Effective combination of machine learning and grammar-based approaches.
Abstract
Clinical trials predicate subject eligibility on a diversity of criteria ranging from patient demographics to food allergies. Trials post their requirements as semantically complex, unstructured free-text. Formalizing trial criteria to a computer-interpretable syntax would facilitate eligibility determination. In this paper, we investigate an information extraction (IE) approach for grounding criteria from trials in ClinicalTrials.gov to a shared knowledge base. We frame the problem as a novel knowledge base population task, and implement a solution combining machine learning and context-free grammar. To our knowledge, this work is the first criteria extraction system to apply an attention-based conditional random field architecture for named entity recognition (NER), and word2vec embedding clustering for named entity linking (NEL). We release the resources and core components of our…
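The abstract does not spell out how embedding clustering is used for entity linking. As a rough illustration only, the sketch below shows one plausible reading: each knowledge-base concept is represented by the centroid of a cluster of known surface forms, and a new mention is linked to the concept whose centroid is most similar by cosine similarity. The concept IDs, the function names (`cosine`, `centroid`, `link_entity`), and the 3-dimensional toy vectors are all hypothetical stand-ins for real word2vec embeddings, not the authors' implementation.

```python
import math

# Toy 3-d vectors standing in for word2vec embeddings (hypothetical values).
TOY_EMBEDDINGS = {
    "hypertension": [0.9, 0.1, 0.0],
    "high blood pressure": [0.8, 0.2, 0.1],
    "peanut allergy": [0.1, 0.9, 0.1],
    "nut allergy": [0.2, 0.8, 0.0],
}

# Hypothetical knowledge-base concepts, each described by a cluster of surface forms.
CONCEPT_CLUSTERS = {
    "CONCEPT_HYPERTENSION": ["hypertension", "high blood pressure"],
    "CONCEPT_NUT_ALLERGY": ["peanut allergy", "nut allergy"],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def link_entity(mention_vector, clusters=CONCEPT_CLUSTERS, embeddings=TOY_EMBEDDINGS):
    """Return (concept_id, similarity) for the closest cluster centroid."""
    best_id, best_score = None, -1.0
    for concept_id, members in clusters.items():
        center = centroid([embeddings[m] for m in members])
        score = cosine(mention_vector, center)
        if score > best_score:
            best_id, best_score = concept_id, score
    return best_id, best_score


if __name__ == "__main__":
    # A mention vector that lies close to the hypertension cluster.
    print(link_entity([0.85, 0.15, 0.05]))
```

In a fuller pipeline, the mention vector would come from averaging the embeddings of the tokens in a span tagged by the NER model, and a similarity threshold could be used to leave low-confidence mentions unlinked.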
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Semantic Web and Ontologies
