TL;DR
MA-COIR assigns semantic search indexes (ssIDs) to ontology concepts and uses a fine-tuned generative model to recognize biomedical concepts, including implicit and complex ones that are never mentioned explicitly, at reduced computational cost.
Contribution
MA-COIR reformulates concept recognition as an indexing-recognition task over semantic search indexes, and uses LLM-generated queries and synthetic data to improve recognition in low-resource scenarios.
Findings
Effective recognition of explicit and implicit concepts
Reduced computational requirements with a pretrained BART model
Improved performance in low-resource settings
Abstract
Recognizing biomedical concepts in text is vital for ontology refinement, knowledge graph construction, and concept relationship discovery. However, traditional concept recognition methods, which rely on explicit mention identification, often fail to capture complex concepts not explicitly stated in the text. To overcome this limitation, we introduce MA-COIR, a framework that reformulates concept recognition as an indexing-recognition task. By assigning semantic search indexes (ssIDs) to concepts, MA-COIR resolves ambiguities in ontology entries and enhances recognition efficiency. Using a pretrained BART-based model fine-tuned on small datasets, our approach reduces computational requirements to facilitate adoption by domain experts. Furthermore, we incorporate queries generated by large language models (LLMs) and synthetic data to improve recognition in low-resource settings.…
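The abstract does not say how ssIDs are constructed, so the following is only a toy sketch of the general indexing-recognition idea, not the paper's procedure. It assumes each concept has an embedding vector (the 2-D vectors here are made up), derives an index by recursively halving the concept set along its widest coordinate, and decodes a hard-coded string that stands in for the output of the fine-tuned BART-based model. The names `assign_ssids`, `format_ssid`, `decode`, and `TOY_VECTORS` are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]
Path = Tuple[int, ...]

# Made-up 2-D "embeddings"; a real system would embed ontology entries with a model.
TOY_VECTORS: Dict[str, Vector] = {
    "apoptosis": (0.95, 0.10),
    "programmed cell death": (0.85, 0.15),
    "glucose transport": (0.10, 0.90),
    "insulin secretion": (0.30, 0.80),
}


def assign_ssids(vectors: Dict[str, Vector], prefix: Path = ()) -> Dict[str, Path]:
    """Assumed scheme: split the concepts in half along the coordinate with the
    largest spread, recurse, and use the branch choices as the index."""
    names = sorted(vectors)
    if len(names) == 1:
        return {names[0]: prefix or (0,)}

    def spread(axis: int) -> float:
        values = [vectors[name][axis] for name in names]
        return max(values) - min(values)

    axis = max(range(len(vectors[names[0]])), key=spread)
    ordered = sorted(names, key=lambda name: (vectors[name][axis], name))
    middle = len(ordered) // 2
    result: Dict[str, Path] = {}
    for branch, group in enumerate((ordered[:middle], ordered[middle:])):
        subset = {name: vectors[name] for name in group}
        result.update(assign_ssids(subset, prefix + (branch,)))
    return result


def format_ssid(path: Path) -> str:
    """Render an index path as text, e.g. (1, 0) becomes "1-0"."""
    return "-".join(str(step) for step in path)


def decode(generated: str, ssids: Dict[str, Path]) -> List[str]:
    """Map a generated ';'-separated index sequence back to concept names,
    skipping indexes that match no concept and dropping repeats."""
    by_index = {format_ssid(path): name for name, path in ssids.items()}
    found: List[str] = []
    for token in generated.split(";"):
        name = by_index.get(token.strip())
        if name is not None and name not in found:
            found.append(name)
    return found


ssids = assign_ssids(TOY_VECTORS)
# Stand-in for model output: two valid indexes and one that matches nothing.
generated = "; ".join(
    [format_ssid(ssids["apoptosis"]), "9-9", format_ssid(ssids["glucose transport"])]
)
recognized = decode(generated, ssids)
```

In this sketch, concepts with nearby vectors share an index prefix, so the generator's target is a short structured sequence rather than free-text concept names, and an index that matches no ontology entry can simply be discarded at decoding time.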
Taxonomy
Methods: Ontology · Hyper-parameter optimization
