Creating a Fine Grained Entity Type Taxonomy Using LLMs
Michael Gunn, Dohyun Park, Nidhish Kamath

TL;DR
This paper demonstrates how GPT-4 and GPT-4 Turbo can autonomously generate a detailed, extensive entity type taxonomy with over 5000 types, improving information extraction tasks.
Contribution
It introduces a novel iterative prompting method for large language models to create fine-grained entity taxonomies automatically.
Findings
The generated taxonomy contains over 5000 nuanced entity types.
The taxonomy significantly improves relation and event argument extraction.
The approach is effective and adaptable for expanding entity classifications.
Abstract
In this study, we investigate the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy. Our objective is to construct a comprehensive taxonomy, starting from a broad classification of entity types - including objects, time, locations, organizations, events, actions, and subjects - similar to existing manually curated taxonomies. This classification is then progressively refined through iterative prompting techniques, leveraging GPT-4's internal knowledge base. The result is an extensive taxonomy comprising over 5000 nuanced entity types, which demonstrates remarkable quality upon subjective evaluation. We employed a straightforward yet effective prompting strategy, enabling the taxonomy to be dynamically expanded. The practical applications of this detailed taxonomy are diverse and significant. It facilitates the…
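The iterative refinement described in the abstract can be pictured as a breadth-first expansion: start from the broad root types and repeatedly ask the model for more specific subtypes of each node. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The prompt wording, the `ask_model` callable, the `max_depth` cutoff, and the de-duplication rule are all hypothetical, and a canned `fake_model` stands in for GPT-4 so the example runs without any API access.

```python
from collections import deque
from typing import Callable, Dict, List

# Broad root categories named in the abstract.
ROOT_TYPES = ["objects", "time", "locations", "organizations",
              "events", "actions", "subjects"]


def build_prompt(entity_type: str) -> str:
    # Hypothetical wording; the paper's actual prompt is not reproduced here.
    return (f"List more specific entity types that fall under "
            f"'{entity_type}', one per line.")


def expand_taxonomy(roots: List[str],
                    ask_model: Callable[[str], str],
                    max_depth: int = 2) -> Dict[str, List[str]]:
    """Breadth-first expansion: map each type to the subtypes the model returns."""
    taxonomy: Dict[str, List[str]] = {}
    seen = set(roots)                      # assumed rule: keep each type name once
    queue = deque((root, 0) for root in roots)
    while queue:
        node, depth = queue.popleft()
        if depth >= max_depth:             # assumed stopping rule: fixed depth
            taxonomy.setdefault(node, [])
            continue
        children = []
        for line in ask_model(build_prompt(node)).splitlines():
            name = line.strip().lower()
            if name and name not in seen:
                seen.add(name)
                children.append(name)
                queue.append((name, depth + 1))
        taxonomy[node] = children
    return taxonomy


def fake_model(prompt: str) -> str:
    # Stand-in for an LLM call: canned replies keyed by the quoted type name.
    canned = {"locations": "city\ncountry", "city": "capital city"}
    for key, reply in canned.items():
        if f"'{key}'" in prompt:
            return reply
    return ""


tree = expand_taxonomy(["locations"], fake_model, max_depth=2)
```

To run this against a real model, `fake_model` would be swapped for a function that sends the prompt to an LLM and returns its text reply. Passing `ROOT_TYPES` as the roots would then mirror the starting point described in the abstract.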
Taxonomy
Topics: Data Mining Algorithms and Applications
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Adam · Softmax · Multi-Head Attention · Layer Normalization · Residual Connection · Absolute Position Encodings
