MaterioMiner -- An ontology-based text mining dataset for extraction of process-structure-property entities
Ali Riza Durmaz, Akhil Thomas, Lokesh Mishra, Rachana Niranjan Murthy, Thomas Straub

TL;DR
MaterioMiner is a detailed dataset linking materials science ontologies with literature text, enabling training and benchmarking of neurosymbolic models for extracting complex process-structure-property relationships.
Contribution
The paper introduces the MaterioMiner dataset with fine-granular annotations linking ontologies to literature, supporting neurosymbolic model training and automated knowledge extraction in materials science.
Findings
High annotation consistency among raters
Successful fine-tuning of pre-trained models for entity recognition
Dataset facilitates materials language model benchmarking
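The first finding concerns agreement among the three raters. The summary does not say which agreement statistic was used, so the following is only a minimal sketch that assumes pairwise Cohen's kappa over token-level labels. The rater names, label names, and toy sequences are hypothetical and are not taken from the dataset.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Fraction of positions where both raters chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def pairwise_kappa(annotations):
    """Map each pair of rater names to the kappa of their label sequences."""
    return {
        (r1, r2): cohen_kappa(annotations[r1], annotations[r2])
        for r1, r2 in combinations(sorted(annotations), 2)
    }


# Hypothetical toy labels for six tokens (not real MaterioMiner annotations).
toy = {
    "rater_1": ["O", "Material", "O", "Property", "O", "Process"],
    "rater_2": ["O", "Material", "O", "Property", "O", "O"],
    "rater_3": ["O", "Material", "O", "O", "O", "Process"],
}
scores = pairwise_kappa(toy)
for pair, kappa in scores.items():
    print(pair, round(kappa, 3))
```

With three raters this yields three pairwise scores. A single multi-rater statistic such as Fleiss' kappa would be an alternative design choice.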
Abstract
While large language models learn sound statistical representations of the language and information therein, ontologies are symbolic knowledge representations that can complement the former ideally. Research at this critical intersection relies on datasets that intertwine ontologies and text corpora to enable training and comprehensive benchmarking of neurosymbolic models. We present the MaterioMiner dataset and the linked materials mechanics ontology where ontological concepts from the mechanics of materials domain are associated with textual entities within the literature corpus. Another distinctive feature of the dataset is its eminently fine-granular annotation. Specifically, 179 distinct classes are manually annotated by three raters within four publications, amounting to a total of 2191 entities that were annotated and curated. Conceptual work is presented for the symbolic…
Taxonomy
Topics: Semantic Web and Ontologies
Methods: Ontology
