Adapting a general parser to a sublanguage
Sophie Aubin (LIPN), Adeline Nazarenko (LIPN), Claire Nédellec (MIG)

TL;DR
This paper presents a method to adapt a general parser for biological sublanguages by leveraging terminology identification, text normalization, lexicon enhancements, and grammar rule adaptation to improve parsing accuracy.
Contribution
It combines terminology identification, text normalization, lexicon and morpho-guessing extensions, and grammar rule adaptation to tailor a general parser (Link Parser) to biological texts.
Findings
Improved parsing accuracy after adaptation
Effective use of terminology identification in parsing
Enhanced parser performance with combined strategies
Abstract
In this paper, we propose a method to adapt a general parser (Link Parser) to sublanguages, focusing on the parsing of texts in biology. Our main proposal is the use of terminology (identification and analysis of terms) in order to reduce the complexity of the text to be parsed. Several other strategies are explored and finally combined, among them text normalization, extensions to the lexicon and morpho-guessing module, and grammar rule adaptation. We compare the parsing results before and after these adaptations.
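As an illustration of the terminology idea in the abstract, the sketch below collapses known multi-word terms into single tokens before a sentence is handed to a parser, so the parser sees fewer words. This is only a minimal sketch under stated assumptions: the term list, the function name collapse_terms, and the underscore-joining convention are hypothetical and are not taken from the paper, which may identify and analyze terms quite differently.

```python
import re

# Hypothetical term list for illustration only; the paper's actual
# terminology resources are not shown on this page.
TERMS = ["sigma factor", "Bacillus subtilis", "transcription start site"]

def collapse_terms(sentence, terms):
    """Replace each known multi-word term with one underscore-joined token."""
    # Process longer terms first so a long term is not split by a shorter one.
    for term in sorted(terms, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        # Keep the original casing of the match; only the spaces are replaced.
        sentence = pattern.sub(lambda m: m.group(0).replace(" ", "_"), sentence)
    return sentence

original = "The sigma factor of Bacillus subtilis binds the promoter."
simplified = collapse_terms(original, TERMS)
print(simplified)
```

Here the simplified sentence has fewer whitespace-separated tokens than the original, which is the sense in which term identification can reduce the complexity of the text to be parsed.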
Taxonomy
Topics: Natural Language Processing Techniques · Biomedical Text Mining and Ontologies · Semantic Web and Ontologies
