TAGIFY: LLM-powered Tagging Interface for Improved Data Findability on OGD portals
Kevin Kliimask, Anastasija Nikiforova

TL;DR
This paper introduces Tagify, an LLM-powered interface designed to automate dataset tagging on OGD portals, enhancing data findability and accessibility by generating relevant tags in multiple languages.
Contribution
The paper presents a novel LLM-based tagging interface that automates dataset tagging on OGD portals, addressing current challenges in data discoverability.
Findings
User feedback indicates improved tagging efficiency.
Automated tags increase dataset discoverability.
Multilingual tagging supports diverse user needs.
Abstract
Efforts to promote Open Government Data (OGD) have gained significant traction across various governmental tiers since the mid-2000s. As more datasets are published on OGD portals, finding specific data becomes harder, leading to information overload. Complete and accurate documentation of datasets, including the association of proper tags with datasets, is key to improving dataset findability and accessibility. An analysis of the Estonian Open Data Portal revealed that 11% of datasets have no associated tags, while 26% have only one tag assigned, which underscores challenges in data findability and accessibility within a portal that, according to the recent Open Data Maturity Report, is considered a trend-setter. The aim of this study is to propose an automated solution for tagging datasets to improve data findability on OGD portals. This paper presents Tagify -…
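The tag-coverage figures quoted in the abstract (the share of datasets with no tags or only one tag) could be computed along the following lines. This is a minimal sketch and not the authors' analysis code: the record layout (a dict with a "tags" list), the function name tag_count_shares, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def tag_count_shares(datasets):
    """Return the share of datasets with zero, one, and two or more tags.

    `datasets` is assumed (hypothetically) to be a list of dicts,
    each with an optional "tags" list of strings.
    """
    if not datasets:
        raise ValueError("no datasets given")

    buckets = Counter()
    for record in datasets:
        # Treat a missing or None "tags" field as an empty list,
        # and ignore empty or whitespace-only tag strings.
        tags = [t for t in (record.get("tags") or []) if t and t.strip()]
        if len(tags) == 0:
            buckets["none"] += 1
        elif len(tags) == 1:
            buckets["one"] += 1
        else:
            buckets["two_or_more"] += 1

    total = len(datasets)
    return {key: buckets[key] / total for key in ("none", "one", "two_or_more")}


# Made-up sample records, not data from the Estonian Open Data Portal.
sample = [
    {"title": "A", "tags": []},
    {"title": "B", "tags": ["transport"]},
    {"title": "C", "tags": ["health", "statistics"]},
    {"title": "D", "tags": ["budget", "finance", "2023"]},
]
shares = tag_count_shares(sample)
```

Applied to real portal metadata, the "none" and "one" shares would correspond to the two percentages reported in the abstract, assuming the portal's records can be exported in a comparable shape.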
Taxonomy
Topics: Semantic Web and Ontologies · Data Quality and Management
Methods: Attention Is All You Need · Label Smoothing · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Absolute Position Encodings · Transformer · Adam
