A Corpus Study and Annotation Schema for Named Entity Recognition and Relation Extraction of Business Products
Saskia Schön, Veselina Mironova, Aleksandra Gabryszak, Leonhard Hennig

TL;DR
This paper introduces a new annotated corpus, schema, and guidelines for recognizing business product entities and relations in noisy web and social media texts, addressing a gap in existing resources.
Contribution
It provides a novel annotation schema and guidelines for non-standard business entities and relations, along with a preliminary annotated corpus for this domain.
Findings
Product mentions are often noun phrases with boundary ambiguity.
High syntactic and semantic variability complicates annotation.
A preliminary corpus of English web and social media documents has been annotated according to the proposed guidelines.
Abstract
Recognizing non-standard entity types and relations, such as B2B products, product classes and their producers, in news and forum texts is important in application areas such as supply chain monitoring and market research. However, there is a decided lack of annotated corpora and annotation guidelines in this domain. In this work, we present a corpus study, an annotation schema, and associated guidelines for the annotation of product entity and company-product relation mentions. We find that although product mentions are often realized as noun phrases, defining their exact extent is difficult due to high boundary ambiguity and the broad syntactic and semantic variety of their surface realizations. We also describe our ongoing annotation effort and present a preliminary corpus of English web and social media documents annotated according to the proposed guidelines.
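To make the boundary-ambiguity point concrete, the following is a minimal sketch of how a product mention and a company-product relation could be stored as character-offset spans. It is not the paper's schema: the label names (`COMPANY`, `PRODUCT`, `COMPANY_PROVIDES_PRODUCT`), the helper functions, and the example sentence are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityMention:
    start: int   # inclusive character offset into the document
    end: int     # exclusive character offset into the document
    label: str   # hypothetical label, e.g. "COMPANY" or "PRODUCT"

    def text(self, doc: str) -> str:
        """Return the surface string covered by this span."""
        return doc[self.start:self.end]


@dataclass(frozen=True)
class RelationMention:
    head: EntityMention  # e.g. the company
    tail: EntityMention  # e.g. the product
    label: str           # hypothetical relation label


def find_mention(doc: str, surface: str, label: str) -> EntityMention:
    """Locate the first occurrence of `surface` and wrap it as a span."""
    start = doc.find(surface)
    if start < 0:
        raise ValueError(f"{surface!r} not found in document")
    return EntityMention(start, start + len(surface), label)


def overlaps(a: EntityMention, b: EntityMention) -> bool:
    """True if the two spans share at least one character."""
    return a.start < b.end and b.start < a.end


# Invented example sentence (not taken from the corpus).
doc = "Acme Corp launched the X200 industrial pump last week."

company = find_mention(doc, "Acme Corp", "COMPANY")

# Two plausible extents for the same product mention: annotators must
# agree on one, which is the boundary ambiguity the abstract describes.
narrow = find_mention(doc, "X200", "PRODUCT")
wide = find_mention(doc, "X200 industrial pump", "PRODUCT")

# A company-product relation linking the two entity spans.
relation = RelationMention(company, wide, "COMPANY_PROVIDES_PRODUCT")
```

Both `narrow` and `wide` are defensible noun-phrase spans for the same product, and they overlap without being equal. Annotation guidelines exist to settle which extent annotators should choose.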
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Web Data Mining and Analysis
