Bridging Textual Data and Conceptual Models: A Model-Agnostic Structuring Approach
Jacques Chabin (LIFO, Pamda), Mirian Halfeld Ferrari (LIFO, Pamda), Nicolas Hiot (LIFO, Pamda)

TL;DR
This paper presents an automated, model-agnostic approach for converting textual data into structured schemas and instances, using semantic trees and iterative refinement, demonstrated on clinical medical data.
Contribution
It introduces a novel method combining semantic syntax trees and attribute grammar refinement for flexible data structuring across models.
Findings
Effective structuring of clinical data demonstrated
Schema and instance generation achieved
Applicable to diverse database models
Abstract
We introduce an automated method for structuring textual data into a model-agnostic schema, enabling alignment with any database model. It generates both a schema and its instance. Initially, textual data is represented as semantically enriched syntax trees, which are then refined through iterative tree rewriting and grammar extraction, guided by an attribute grammar meta-model. The applicability of this approach is demonstrated using clinical medical cases as a proof of concept.
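To make the pipeline in the abstract more concrete, the following is a minimal sketch of the general idea only: rewrite a labeled tree until it stops changing, then read grammar productions (a schema) and nested values (an instance) off the result. It is not the authors' implementation. The `Node` class, the `drop_wrapper` rule, the label names, and the toy clinical tree are all hypothetical, and the sketch extracts plain context-free productions rather than an attribute grammar.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class Node:
    """Hypothetical labeled tree node; leaves carry the source text."""
    label: str
    children: List["Node"] = field(default_factory=list)
    text: Optional[str] = None


Rule = Callable[[Node], Optional[Node]]


def drop_wrapper(node: Node) -> Optional[Node]:
    """Illustrative rule: replace a single-child 'Wrapper' node by its child."""
    if node.label == "Wrapper" and len(node.children) == 1:
        return node.children[0]
    return None


def rewrite(node: Node, rules: List[Rule]) -> Node:
    """One bottom-up pass: rewrite the children first, then the node itself."""
    node = Node(node.label, [rewrite(c, rules) for c in node.children], node.text)
    for rule in rules:
        result = rule(node)
        if result is not None:
            node = result
    return node


def rewrite_to_fixpoint(tree: Node, rules: List[Rule], max_rounds: int = 10) -> Node:
    """Repeat passes until the tree no longer changes (or max_rounds is hit)."""
    for _ in range(max_rounds):
        new_tree = rewrite(tree, rules)
        if new_tree == tree:  # dataclass equality compares the trees structurally
            return new_tree
        tree = new_tree
    return tree


def extract_grammar(node: Node, grammar=None) -> Dict[str, Set[Tuple[str, ...]]]:
    """Collect one production (label -> child labels) per internal node."""
    grammar = {} if grammar is None else grammar
    if node.children:
        grammar.setdefault(node.label, set()).add(tuple(c.label for c in node.children))
        for child in node.children:
            extract_grammar(child, grammar)
    return grammar


def extract_instance(node: Node):
    """Nested dict of leaf texts; assumes sibling labels are unique."""
    if not node.children:
        return node.text
    return {c.label: extract_instance(c) for c in node.children}


# Toy input, invented for illustration only.
raw = Node("Case", [
    Node("Wrapper", [Node("Patient", [Node("Age", text="54")])]),
    Node("Symptom", text="fever"),
])
refined = rewrite_to_fixpoint(raw, [drop_wrapper])
schema = extract_grammar(refined)
instance = extract_instance(refined)
```

In this toy run, `schema` holds the productions `Case -> Patient Symptom` and `Patient -> Age`, while `instance` holds the matching values. Such a grammar could then be mapped to whichever database model is targeted, which is the sense in which the structuring is model-agnostic.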
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Semantic Web and Ontologies · Model-Driven Software Engineering Techniques
