Grammatical Case Based IS-A Relation Extraction with Boosting for Polish
Paweł Łoziński, Dariusz Czerski, Mieczysław A. Kłopotek

TL;DR
This paper introduces a novel method for extracting IS-A relations from Polish text using grammatical case and morpho-syntactic annotations, along with a boosting technique to enhance extraction coverage, tested on a large web corpus.
Contribution
It presents a new approach leveraging grammatical case for IS-A relation extraction and introduces pseudo-subclass boosting to improve relation extraction recall.
Findings
Effective extraction of IS-A relations from Polish web data.
Boosting increases the number of extracted relations.
Method outperforms traditional pattern-based approaches.
Abstract
Pattern-based methods of IS-A relation extraction rely heavily on so-called Hearst patterns, which are ways of expressing instance enumerations of a class in natural language. While these lexico-syntactic patterns prove quite useful, they may not capture all taxonomic relations expressed in text. Therefore, in this paper we describe a novel method of IS-A relation extraction from patterns, which uses morpho-syntactic annotations along with the grammatical case of the noun phrases that constitute the entities participating in an IS-A relation. We also describe a method for increasing the number of extracted relations, which we call pseudo-subclass boosting and which has potential application in any pattern-based relation extraction method. Experiments were conducted on a corpus of about 0.5 billion web documents in Polish.
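To make the idea of a case-based pattern concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Token` type, the `extract_is_a` function, and the single pattern used (a nominative noun, a form of the copula "być", then an instrumental noun, as in "Kot jest zwierzęciem") are assumptions chosen for illustration. The morpho-syntactic annotations are written by hand here, whereas a real pipeline would obtain them from a tagger and would match full noun phrases rather than single nouns.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    """Hypothetical annotated token; field names are illustrative only."""
    form: str   # surface form as it appears in the text
    lemma: str  # base form
    pos: str    # coarse part of speech, e.g. "noun", "verb"
    case: str   # grammatical case, e.g. "nom", "inst"; "" if not applicable


def extract_is_a(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Return (instance, class) lemma pairs for NOUN[nom] + 'być' + NOUN[inst]."""
    pairs = []
    # Slide a window of three consecutive tokens over the sentence.
    for left, copula, right in zip(tokens, tokens[1:], tokens[2:]):
        if (left.pos == "noun" and left.case == "nom"
                and copula.lemma == "być"
                and right.pos == "noun" and right.case == "inst"):
            pairs.append((left.lemma, right.lemma))
    return pairs


# "Kot jest zwierzęciem." ("A cat is an animal.")
sentence = [
    Token("Kot", "kot", "noun", "nom"),
    Token("jest", "być", "verb", ""),
    Token("zwierzęciem", "zwierzę", "noun", "inst"),
    Token(".", ".", "punct", ""),
]

print(extract_is_a(sentence))  # [('kot', 'zwierzę')]
```

The case check is what separates this from a purely lexical pattern: the same word order with an accusative object ("Kot widzi zwierzę", "A cat sees an animal") yields no pair, because neither the copula lemma nor the instrumental case matches.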
