Vocabulary growth in collaborative tagging systems
Ciro Cattuto, Andrea Baldassarri, Vito D. P. Servedio, Vittorio Loreto

TL;DR
This study analyzes how the tag vocabulary of del.icio.us grows over time. It finds sub-linear power-law growth in both the global and the local vocabularies, which points to non-trivial cognitive processes underlying collaborative tagging.
Contribution
It provides a large-scale empirical analysis of vocabulary growth, finding power-law growth with exponents smaller than one for both the global vocabulary and the local vocabularies of individual resources or users.
Findings
Vocabulary size grows as a power law of a suitably defined notion of time.
Growth exponents are smaller than one, so growth is sub-linear.
The growth behavior is regular throughout the history of the system and across very different resources.
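To make these findings concrete, the following is a minimal sketch of how a growth exponent could be measured from a stream of tag assignments. It is not the authors' code. It assumes that "time" is simply the running count of tag assignments, writes the vocabulary size as N(t) and the exponent as gamma (symbols introduced here for illustration), and estimates gamma as the least-squares slope of log N(t) against log t. The function names and the synthetic stream are hypothetical; no del.icio.us data is used.

```python
import math
import random


def vocabulary_growth(tag_stream):
    """Return N(t): the number of distinct tags seen after each tag assignment."""
    seen = set()
    growth = []
    for tag in tag_stream:
        seen.add(tag)
        growth.append(len(seen))
    return growth


def fit_growth_exponent(growth):
    """Estimate gamma in N(t) ~ t**gamma as the least-squares slope of
    log N(t) against log t, with t = 1, 2, ..., len(growth)."""
    xs = [math.log(t) for t in range(1, len(growth) + 1)]
    ys = [math.log(n) for n in growth]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def synthetic_tag_stream(length, seed=0):
    """Toy stream (an assumption for illustration, not real data): the t-th
    assignment introduces a new tag with probability t**-0.3 and otherwise
    reuses a previously seen tag chosen uniformly at random."""
    rng = random.Random(seed)
    stream = []
    next_id = 0
    for t in range(1, length + 1):
        if rng.random() < t ** -0.3:
            stream.append(next_id)  # a tag never seen before
            next_id += 1
        else:
            stream.append(rng.randrange(next_id))  # reuse an existing tag
    return stream


if __name__ == "__main__":
    growth = vocabulary_growth(synthetic_tag_stream(20000))
    print("estimated growth exponent:", fit_growth_exponent(growth))
```

The same two functions could be applied to a local vocabulary by first filtering the stream down to the tags attached to a single resource or user. A plain log-log regression is only a rough estimator, so a careful analysis would also inspect the curve visually.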
Abstract
We analyze a large-scale snapshot of del.icio.us and investigate how the number of different tags in the system grows as a function of a suitably defined notion of time. We study the temporal evolution of the global vocabulary size, i.e., the number of distinct tags in the entire system, as well as the evolution of local vocabularies, that is, the growth of the number of distinct tags used in the context of a given resource or user. In both cases, we find power-law behaviors with exponents smaller than one. Surprisingly, the observed growth behaviors are remarkably regular throughout the entire history of the system and across very different resources being bookmarked. Similar sub-linear laws of growth have been observed in written text, and this qualitative universality calls for an explanation and points in the direction of non-trivial cognitive processes in the complex interaction…
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · Semantic Web and Ontologies
