CEAR: Automatic construction of a knowledge graph of chemical entities and roles from scientific literature
Stefan Langer, Fabian Neuhaus, Andreas Nürnberger

TL;DR
This paper presents CEAR, a method that combines ontological data and large language models to automatically extract chemical entities and roles from scientific literature, creating a comprehensive knowledge graph to enhance existing resources.
Contribution
The paper introduces a novel approach that integrates ontological knowledge with LLMs to automatically construct a chemical knowledge graph from scientific texts.
Findings
High precision and recall in identifying chemical entities and roles
Successful extraction from 8,000 ChemRxiv articles
Extension of existing chemical ontologies with literature-derived data
Abstract
Ontologies are formal representations of knowledge in specific domains that provide a structured framework for organizing and understanding complex information. Creating ontologies, however, is a complex and time-consuming endeavor. ChEBI is a well-known ontology in the field of chemistry, which provides a comprehensive resource for defining chemical entities and their properties. However, it covers only a small fraction of the rapidly growing knowledge in chemistry and does not provide references to the scientific literature. To address this, we propose a methodology that involves augmenting existing annotated text corpora with knowledge from ChEBI and fine-tuning a large language model (LLM) to recognize chemical entities and their roles in scientific text. Our experiments demonstrate the effectiveness of our approach. By combining ontological knowledge and the language understanding…
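The corpus-augmentation step mentioned in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' pipeline: the `Span` type, the `augment` function, the lexicon contents, and the placeholder identifiers are all hypothetical, and a real system would look entities up in ChEBI itself rather than in a hand-written dictionary.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Span:
    start: int                      # index of the first token of the mention
    end: int                        # index one past the last token
    label: str                      # annotation label from the corpus
    onto_id: Optional[str] = None   # ontology identifier, if one was found
    roles: Tuple[str, ...] = ()     # roles attached from the ontology

# Hypothetical miniature lexicon: surface form -> (identifier, roles).
# Identifiers and roles are placeholders, not verified ChEBI entries.
LEXICON = {
    "aspirin": ("CHEBI:PLACEHOLDER_1", ("anti-inflammatory agent",)),
    "ethanol": ("CHEBI:PLACEHOLDER_2", ("solvent",)),
}

def augment(tokens, spans, lexicon=LEXICON):
    """Attach ontology identifiers and roles to already-annotated spans."""
    out = []
    for s in spans:
        surface = " ".join(tokens[s.start:s.end]).lower()
        entry = lexicon.get(surface)
        if entry is None:
            # Mention not covered by the lexicon: keep the annotation unchanged.
            out.append(s)
        else:
            out.append(Span(s.start, s.end, s.label, entry[0], entry[1]))
    return out

tokens = "Aspirin was dissolved in ethanol and stirred with compound X".split()
spans = [Span(0, 1, "CHEMICAL"), Span(4, 5, "CHEMICAL"), Span(8, 10, "CHEMICAL")]
augmented = augment(tokens, spans)
```

In this sketch, mentions found in the lexicon gain an identifier and roles, while unknown mentions such as "compound X" pass through untouched. Examples enriched in this way are the kind of data one might then use to fine-tune a model to label both entities and roles.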
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Semantic Web and Ontologies · History and advancements in chemistry
Methods: Sparse Evolutionary Training · Ontology
