A Corpus-based Evaluation of Lexical Components of a Domain-specific Text to Knowledge Mapping Prototype
Rushdi Shams, Adel Elsayed

TL;DR
This paper evaluates a domain-specific Text to Knowledge Mapping prototype for physics by developing and using a representative corpus to assess and enhance its lexical components and parsing capabilities.
Contribution
It introduces a corpus-based evaluation method for a physics domain TKM prototype, enriching its lexical resources and improving sentence parsing accuracy.
Findings
Enhanced lexical knowledge base with corpus annotation
Improved sentence parsing performance
Validated prototype with representative physics texts
Abstract
The aim of this paper is to evaluate the lexical components of a Text to Knowledge Mapping (TKM) prototype. The prototype is domain-specific; its purpose is to map instructional text onto a knowledge domain. The knowledge domain of the prototype is physics, specifically DC electrical circuits. During development, the prototype was tested with a limited data set from the domain. The prototype has now reached a stage where it needs to be evaluated with a representative linguistic data set, called a corpus. A corpus is a collection of text drawn from typical sources that can be used as a test data set to evaluate NLP systems. As no corpus is available for the domain, we developed a representative corpus and annotated it with linguistic information. The evaluation of the prototype considers one of its two main components, the lexical knowledge base. With the corpus,…
Taxonomy
Topics: Natural Language Processing Techniques · Advanced Text Analysis Techniques · Wikis in Education and Collaboration
