The Lokahi Prototype: Toward the automatic Extraction of Entity Relationship Models from Text
Michael Kaufmann

TL;DR
The paper introduces the Lokahi prototype, which automatically extracts entities and relationships from text using statistical measures, aiming to generate semantic data models.
Contribution
It presents a prototype that uses TF*IDF for entity extraction and document-level co-occurrence statistics for relationship extraction, as a first step toward automatic semantic data modeling from text.
Findings
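Entities are extracted using the TF*IDF measure.
Relationships are generated from document-level co-occurrence statistics, for example likelihood ratios and pointwise mutual information.
The paper summarizes insights from two research projects and outlines directions for further research on entity relationship extraction from text.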
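To make the first finding concrete, here is a minimal sketch of TF*IDF-based term ranking. It is not the Lokahi implementation: the function names (`tokenize`, `tfidf_scores`, `top_terms`), the single-word tokenizer, the relative-frequency TF, the unsmoothed logarithmic IDF, and the toy documents are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase single words stand in for entity candidates.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(docs):
    """Return one {term: tf * idf} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1
        scores.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return scores


def top_terms(docs, k=3):
    """Return the k highest-scoring terms per document (ties broken alphabetically)."""
    result = []
    for s in tfidf_scores(docs):
        ranked = sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))
        result.append([t for t, _ in ranked[:k]])
    return result


# Hypothetical toy corpus.
docs = [
    "the customer places an order",
    "the order contains a product",
    "the supplier delivers the product",
]
candidates = top_terms(docs, k=3)
```

In this toy corpus "the" occurs in every document, so its IDF is zero and it never ranks as a candidate. A realistic pipeline would presumably also need stop-word handling and multi-word terms, which this sketch leaves out.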
Abstract
Entity relationship extraction envisions the automatic generation of semantic data models from collections of text, by automatic recognition of entities, by association of entities to form relationships, and by classifying these instances to assign them to entity sets (or classes) and relationship sets (or associations). As a first step in this direction, the Lokahi prototype can extract entities based on the TF*IDF measure, and generate semantic relationships based on document-level co-occurrence statistics, for example with likelihood ratios and pointwise mutual information. This paper presents results of an explorative, prototypical, qualitative and synthetic research, summarizes insights from two research projects and, based on this, indicates an outline for further research in the field of entity relationship extraction from text.
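The two co-occurrence measures named in the abstract can be sketched as follows, assuming each document has already been reduced to a set of extracted entities. The function names, the base-2 logarithm for pointwise mutual information, and the 2x2 contingency-table form of the log-likelihood ratio (the G² statistic) are assumptions for illustration. The abstract does not specify which variants the prototype uses.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(doc_entities):
    """Count, per entity and per entity pair, the documents they appear in."""
    single = Counter()
    pair = Counter()
    for ents in doc_entities:
        single.update(ents)
        pair.update(combinations(sorted(ents), 2))
    return single, pair


def pmi(c_xy, c_x, c_y, n):
    """Pointwise mutual information: log2( P(x, y) / (P(x) * P(y)) )."""
    return math.log2((c_xy * n) / (c_x * c_y))


def log_likelihood_ratio(c_xy, c_x, c_y, n):
    """G^2 statistic over the 2x2 table of document-level (co-)occurrence."""
    k = [[c_xy, c_x - c_xy], [c_y - c_xy, n - c_x - c_y + c_xy]]
    rows = [sum(r) for r in k]
    cols = [k[0][0] + k[1][0], k[0][1] + k[1][1]]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            if k[i][j] > 0:  # a zero cell contributes nothing
                g2 += k[i][j] * math.log(k[i][j] * n / (rows[i] * cols[j]))
    return 2.0 * g2


def score_pairs(doc_entities):
    """Return {(x, y): (pmi, llr)} for every co-occurring entity pair."""
    n = len(doc_entities)
    single, pair = cooccurrence_counts(doc_entities)
    return {
        (x, y): (
            pmi(c, single[x], single[y], n),
            log_likelihood_ratio(c, single[x], single[y], n),
        )
        for (x, y), c in pair.items()
    }


# Hypothetical entity sets, one per document.
doc_entities = [
    {"customer", "order"},
    {"order", "product"},
    {"product", "supplier"},
    {"customer", "order"},
]
pair_scores = score_pairs(doc_entities)
```

Pairs with a high score are candidate relationships. PMI tends to favor rare pairs, while the likelihood ratio is less sensitive to low counts, which is one plausible reason to compute both.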
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
