Automated Detection of Clinical Entities in Lung and Breast Cancer Reports Using NLP Techniques
J. Moreno-Casanova, J.M. Auñón, A. Martínez-Pérez, M.E. Pérez-Martínez, M.E. Gas-López

TL;DR
This study applies NLP techniques, specifically fine-tuned RoBERTa models, to automatically extract clinical entities from lung and breast cancer reports, improving data extraction efficiency from electronic health records.
Contribution
It introduces a novel application of fine-tuned biomedical NLP models for clinical entity recognition in Spanish cancer reports, utilizing a specialized dataset and a state-of-the-art transformer architecture.
Findings
High accuracy in identifying key entities like MET and PAT.
Challenges remain in recognizing less frequent entities such as EVOL.
Effective use of NLP enhances data extraction from clinical reports.
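The gap between frequent and rare entity types noted above only shows up when scores are reported per label instead of as one aggregate number. The following is a minimal sketch of such a per-label evaluation, assuming entities are represented as exact-match (start, end, label) spans. The function name `per_label_scores` and the toy spans are hypothetical, invented purely for illustration. They are not the paper's evaluation code or data.

```python
from collections import Counter


def per_label_scores(gold, pred):
    """Entity-level precision, recall and F1 per label.

    gold, pred: sets of (start, end, label) spans. A predicted span counts
    as correct only if it matches a gold span exactly (assumed convention).
    """
    true_pos = Counter(label for (_, _, label) in gold & pred)
    n_gold = Counter(label for (_, _, label) in gold)
    n_pred = Counter(label for (_, _, label) in pred)

    scores = {}
    for label in sorted(set(n_gold) | set(n_pred)):
        precision = true_pos[label] / n_pred[label] if n_pred[label] else 0.0
        recall = true_pos[label] / n_gold[label] if n_gold[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": n_gold[label],  # number of gold spans with this label
        }
    return scores


# Toy spans, invented for illustration only (not data from the paper).
gold = {(0, 2, "MET"), (5, 6, "PAT"), (9, 11, "EVOL"), (14, 15, "MET")}
pred = {(0, 2, "MET"), (5, 6, "PAT"), (14, 15, "MET"), (20, 21, "EVOL")}

scores = per_label_scores(gold, pred)
for label, s in scores.items():
    print(label, s)
```

Reporting `support` next to each score makes it clear when a low F1 comes from a label with few gold examples, which is the situation described for EVOL.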
Abstract
Research projects, including those focused on cancer, rely on the manual extraction of information from clinical reports. This process is time-consuming and prone to errors, limiting the efficiency of data-driven approaches in healthcare. To address these challenges, Natural Language Processing (NLP) offers an alternative for automating the extraction of relevant data from electronic health records (EHRs). In this study, we focus on lung and breast cancer due to their high incidence and the significant impact they have on public health. Early detection and effective data management in both types of cancer are crucial for improving patient outcomes. To enhance the accuracy and efficiency of data extraction, we utilized GMV's NLP tool uQuery, which excels at identifying relevant entities in clinical texts and converting them into standardized formats such as SNOMED and OMOP. uQuery not…
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Advanced Text Analysis Techniques
