LASIGE and UNICAGE solution to the NASA LitCoin NLP Competition
Pedro Ruas, Diana F. Sousa, André Neves, Carlos Cruz, Francisco M. Couto

TL;DR
This paper presents an integrated biomedical NLP pipeline that combines industry data engineering with academic NLP tools. Applied in the 2022 LitCoin NLP Challenge, it placed 7th among approximately 200 teams.
Contribution
It introduces a novel integration of industry and academic NLP components with external biomedical knowledge sources for improved biomedical text processing.
Findings
Achieved 7th place in the LitCoin NLP Challenge
Demonstrated effective integration of industry and academic NLP tools
Provided open-source software for biomedical NLP tasks
Abstract
Biomedical Natural Language Processing (NLP) tends to become cumbersome for most researchers, frequently due to the amount and heterogeneity of text to be processed. To address this challenge, the industry is continuously developing highly efficient tools and creating more flexible engineering solutions. This work presents the integration between industry data engineering solutions for efficient data processing and academic systems developed for Named Entity Recognition (LasigeUnicage_NER) and Relation Extraction (BiOnt). Our design reflects an integration of those components with external knowledge in the form of additional training data from other datasets and biomedical ontologies. We used this pipeline in the 2022 LitCoin NLP Challenge, where our team LasigeUnicage was awarded the 7th Prize out of approximately 200 participating teams, reflecting a successful collaboration between…
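To make the two-stage shape described in the abstract concrete (entity recognition first, relation extraction over the recognized entities second), here is a minimal toy sketch. It is not the LasigeUnicage_NER or BiOnt implementation: the lexicon, the entity labels, and the function names `recognize_entities`, `candidate_relations`, and `run_pipeline` are all hypothetical, and a dictionary lookup plus sentence-level co-occurrence stands in for the trained models the paper uses.

```python
import re
from itertools import combinations

# Hypothetical toy lexicon (illustrative labels, not the challenge's schema).
LEXICON = {
    "aspirin": "Chemical",
    "headache": "Disease",
    "brca1": "Gene",
}

def recognize_entities(sentence):
    """Stage 1 (NER stand-in): dictionary lookup over word tokens."""
    entities = []
    for match in re.finditer(r"\w+", sentence):
        label = LEXICON.get(match.group().lower())
        if label is not None:
            # Keep the surface form, its label, and its character offset.
            entities.append((match.group(), label, match.start()))
    return entities

def candidate_relations(entities):
    """Stage 2 (RE stand-in): pair entities that co-occur in one sentence."""
    return [(a[0], b[0]) for a, b in combinations(entities, 2)]

def run_pipeline(text):
    """Split text into sentences, then run both stages on each sentence."""
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        entities = recognize_entities(sentence)
        results.append((entities, candidate_relations(entities)))
    return results

if __name__ == "__main__":
    for entities, relations in run_pipeline("Aspirin relieves headache. BRCA1 is a gene."):
        print(entities, relations)
```

In a real system each stand-in would be replaced by a trained model, and the candidate pairs would be classified into relation types rather than returned as-is.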
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
