Common-Knowledge Concept Recognition for SEVA
Jitin Krishnan, Patrick Coronado, Hemant Purohit, and Huzefa Rangwala

TL;DR
This paper presents a system for recognizing systems engineering concepts in text, leveraging a fine-tuned language model and domain-specific datasets to enable downstream tasks like knowledge graph construction.
Contribution
It introduces a novel annotated dataset and a fine-tuning approach for concept recognition in systems engineering texts, facilitating knowledge extraction for SEVA.
Findings
Achieved effective concept recognition with the fine-tuned model.
Constructed a knowledge graph from extracted concepts.
Developed domain-specific datasets for abbreviations and definitions.
Abstract
We build a common-knowledge concept recognition system for a Systems Engineer's Virtual Assistant (SEVA), which can be used for downstream tasks such as relation extraction, knowledge graph construction, and question answering. The problem is formulated as a token classification task similar to named entity extraction. With the help of a domain expert and text processing methods, we construct a dataset annotated at the word level, carefully defining a labeling scheme so that a sequence model can be trained to recognize systems engineering concepts. We fine-tune a pre-trained language model on this labeled dataset of concepts. We also create essential datasets for information such as abbreviations and definitions from the systems engineering domain. Finally, we construct a simple knowledge graph from the extracted concepts along with some hyponym relations.
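To make the token classification formulation concrete, the following is a minimal sketch of how word-level tags can be collapsed back into concept spans. It assumes a BIO-style scheme (B- begins a concept, I- continues it, O is outside any concept); the label names `EVENT` and `ABBR`, the function `decode_bio`, and the example sentence are hypothetical illustrations, not the labeling scheme or data from the paper.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse word-level BIO tags into (concept_text, label) spans."""
    spans: List[Tuple[str, str]] = []
    current_words: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Emit the span collected so far, if any.
        if current_words and current_label is not None:
            spans.append((" ".join(current_words), current_label))

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A new concept starts (a stray I- tag is treated as a start).
            flush()
            current_words, current_label = [token], label
        elif prefix == "I":
            # Continue the concept that is currently open.
            current_words.append(token)
        else:
            # "O": the token is outside any concept.
            flush()
            current_words, current_label = [], None

    flush()
    return spans


# Hypothetical sentence and tags; in practice the tags would come from the
# fine-tuned sequence model rather than being written by hand.
tokens = ["The", "Preliminary", "Design", "Review", "(", "PDR", ")",
          "precedes", "fabrication", "."]
tags = ["O", "B-EVENT", "I-EVENT", "I-EVENT", "O", "B-ABBR", "O",
        "O", "O", "O"]

spans = decode_bio(tokens, tags)
print(spans)
```

Spans recovered this way are the kind of unit that downstream steps such as relation extraction or knowledge graph construction would take as input.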
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Web Data Mining and Analysis
