Large Language Models for Scholarly Ontology Generation: An Extensive Analysis in the Engineering Field
Tanay Aggarwal, Angelo Salatino, Francesco Osborne, Enrico Motta

TL;DR
This paper evaluates how well large language models identify semantic relationships between research topics, a key step in automated ontology generation. It finds that smaller, optimized models can perform comparably to larger ones, which makes efficient structuring of scientific knowledge feasible.
Contribution
The study provides a comprehensive evaluation of 17 LLMs on identifying relationships between research topics, introduces a benchmark based on the IEEE Thesaurus, and demonstrates the effectiveness of smaller, optimized models.
Findings
Several models achieved high F1-scores, with Claude 3 Sonnet reaching 0.967.
Smaller, quantised models can match larger models' performance with proper prompt engineering.
Optimized small models require fewer computational resources while maintaining accuracy.
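For readers unfamiliar with the metric behind these findings, the following is a minimal sketch of how an F1-score could be computed for a topic-relationship classifier. It assumes macro-averaging over labels, which this summary does not confirm as the paper's choice. The function names `f1_per_label` and `macro_f1` and the sample data are illustrative, and the label `"other"` is a hypothetical placeholder rather than one of the paper's relationship types.

```python
from collections import Counter


def f1_per_label(gold, predicted):
    """Return {label: F1} for parallel lists of gold and predicted labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for label g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the right answer but was missed
    scores = {}
    for label in set(gold) | set(predicted):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, predicted):
    """Unweighted mean of per-label F1 (assumed averaging scheme)."""
    scores = f1_per_label(gold, predicted)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Illustrative data only; "other" is a hypothetical placeholder label.
gold = ["broader", "narrower", "broader", "other"]
pred = ["broader", "narrower", "narrower", "other"]
print(f"macro F1 = {macro_f1(gold, pred):.3f}")
```

Macro-averaging weights every relationship type equally regardless of how often it occurs. A micro- or support-weighted average would be a reasonable alternative if the benchmark's classes are imbalanced.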
Abstract
Ontologies of research topics are crucial for structuring scientific knowledge, enabling scientists to navigate vast amounts of research, and forming the backbone of intelligent systems such as search engines and recommendation systems. However, manual creation of these ontologies is expensive, slow, and often results in outdated and overly general representations. As a solution, researchers have been investigating ways to automate or semi-automate the process of generating these ontologies. This paper offers a comprehensive analysis of the ability of large language models (LLMs) to identify semantic relationships between different research topics, which is a critical step in the development of such ontologies. To this end, we developed a gold standard based on the IEEE Thesaurus to evaluate the task of identifying four types of relationships between pairs of topics: broader, narrower,…
Taxonomy
Topics: Semantic Web and Ontologies
