Matching Table Metadata with Business Glossaries Using Large Language Models
Elita Lobo, Oktie Hassanzadeh, Nhan Pham, Nandana Mihindukulasooriya, Dharmashankar Subramanian, Horst Samulowitz

TL;DR
This paper explores using large language models to match database table metadata with business glossaries, enabling better data retrieval and analysis without extensive manual tuning or access to data contents.
Contribution
It introduces LLM-based methods for matching metadata to glossaries that handle complex descriptions without manual tuning, improving over traditional similarity measures.
Findings
LLM-based methods effectively match complex descriptions.
Generated context improves matching accuracy.
Preliminary results show promising effectiveness.
Abstract
Enterprises often own large collections of structured data in the form of large databases or an enterprise data lake. Such data collections come with limited metadata and strict access policies that could limit access to the data contents and, therefore, limit the application of classic retrieval and analysis solutions. As a result, there is a need for solutions that can effectively utilize the available metadata. In this paper, we study the problem of matching table metadata to a business glossary containing data labels and descriptions. The resulting matching enables the use of an available or curated business glossary for retrieval and analysis without or before requesting access to the data contents. One solution to this problem is to use manually-defined rules or similarity measures on column names and glossary descriptions (or their vector embeddings) to find the closest match.…
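The similarity-measure baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's LLM-based method, only one plausible form of the classic approach it is contrasted with. The function names, the choice of token-level Jaccard similarity, and the sample glossary are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Split text into a set of lowercase tokens.

    Breaks camelCase boundaries first, then splits on anything that is
    not a letter or digit (so underscores and spaces both separate tokens).
    """
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return set(re.findall(r"[a-z0-9]+", spaced.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def closest_glossary_entry(column_name, glossary):
    """Return (label, score) for the glossary entry most similar to the column name.

    `glossary` maps a label to its description (hypothetical structure).
    Each entry is scored on the tokens of its label and description combined.
    """
    column_tokens = tokenize(column_name)
    scored = (
        (label, jaccard(column_tokens, tokenize(label + " " + description)))
        for label, description in glossary.items()
    )
    return max(scored, key=lambda pair: pair[1])


# Hypothetical glossary, invented for this example.
glossary = {
    "Customer Identifier": "Unique identifier assigned to a customer",
    "Order Date": "Date on which the order was placed",
    "Shipping Address": "Address to which an order is shipped",
}

label, score = closest_glossary_entry("cust_order_date", glossary)
print(label, round(score, 2))
```

A measure like this only rewards exact token overlap, so an abbreviated column name such as `cust_id` would share no tokens with "Customer Identifier". That kind of gap is what makes rule-based or lexical matching hard to rely on without further tuning.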
Taxonomy
Topics: Data Quality and Management · Semantic Web and Ontologies · Topic Modeling
