Analyzing Tag Distributions in Folksonomies for Resource Classification
Arkaitz Zubiaga, Raquel Martínez, Víctor Fresno

TL;DR
This paper investigates how the settings of social tagging systems shape the tag distributions of the resulting folksonomies, and how those distributions in turn affect resource classification. It highlights tag suggestions as a particularly influential setting.
Contribution
It provides an in-depth analysis of tag distributions across large datasets and examines how system settings affect classification performance and tag weighting schemes.
Findings
Tag suggestions significantly alter folksonomy structures.
System settings influence how effective TF-IDF-based weighting schemes are.
Understanding system configurations aids in optimizing resource classification.
Abstract
Recent research has shown the usefulness of social tags as a data source to feed resource classification. Little is known about the effect of settings on folksonomies created on social tagging systems. In this work, we consider the settings of social tagging systems to further understand tag distributions in folksonomies. We analyze in depth the tag distributions on three large-scale social tagging datasets, and analyze the effect on a resource classification task. To this end, we study the appropriateness of applying weighting schemes based on the well-known TF-IDF for resource classification. We show the great importance of settings as to altering tag distributions. Among those settings, tag suggestions produce very different folksonomies, which condition the success of the employed weighting schemes. Our findings and analyses are relevant for researchers studying tag-based resource…
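To make the idea of a TF-IDF-based weighting scheme for tags concrete, here is a minimal sketch. It is not taken from the paper: the function name `tfidf_tag_weights`, the input layout, and the toy data are all assumptions made for illustration. It treats the number of users who assigned a tag to a resource as the term frequency and uses an inverse resource frequency, log(N / number of resources carrying the tag), as the IDF-like factor.

```python
import math
from collections import Counter


def tfidf_tag_weights(annotations):
    """Weight each (resource, tag) pair with a TF-IDF-style score.

    `annotations` maps a resource id to a Counter of tag -> number of
    users who assigned that tag (a hypothetical input layout).
    """
    n_resources = len(annotations)

    # Count in how many resources each tag appears at least once.
    resource_freq = Counter()
    for tag_counts in annotations.values():
        resource_freq.update(tag_counts.keys())

    # weight = tag frequency on the resource * log(N / resource frequency)
    weights = {}
    for resource, tag_counts in annotations.items():
        weights[resource] = {
            tag: count * math.log(n_resources / resource_freq[tag])
            for tag, count in tag_counts.items()
        }
    return weights


# Toy folksonomy, invented for illustration only.
toy = {
    "r1": Counter({"python": 3, "web": 1, "toread": 1}),
    "r2": Counter({"python": 1, "news": 2, "toread": 2}),
    "r3": Counter({"web": 2, "design": 4, "toread": 1}),
}
weights = tfidf_tag_weights(toy)
```

Under this formula a tag attached to every resource (here the made-up tag `toread`) gets weight zero, while a tag unique to one resource is boosted. This is why the shape of the tag distribution, which the abstract says depends on system settings, matters for such weighting schemes.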
Taxonomy
Topics: Text and Document Classification Technologies · Recommender Systems and Techniques · Web Data Mining and Analysis
