EvoTaxo: Building and Evolving Taxonomy from Social Media Streams
Yiyang Li, Tianyi Ma, Yanfang Ye

TL;DR
EvoTaxo is a framework that builds and updates taxonomies from social media streams by combining semantic and temporal information, so that the taxonomy tracks evolving discourse patterns.
Contribution
It introduces an LLM-based method that converts social media posts into structured edits, consolidates evidence over time, and maintains semantic boundaries, advancing taxonomy induction from dynamic social media streams.
Findings
Produces more balanced taxonomies than baselines
Achieves better coverage and structural quality
Captures meaningful temporal shifts in discourse
Abstract
Constructing taxonomies from social media corpora is challenging because posts are short, noisy, semantically entangled, and temporally dynamic. Existing taxonomy induction methods are largely designed for static corpora and often struggle to balance robustness, scalability, and sensitivity to evolving discourse. We propose EvoTaxo, an LLM-based framework for building and evolving taxonomies from temporally ordered social media streams. Rather than clustering raw posts directly, EvoTaxo converts each post into a structured draft action over the current taxonomy, accumulates structural evidence over time windows, and consolidates candidate edits through dual-view clustering that combines semantic similarity with temporal locality. A refinement-and-arbitration procedure then selects reliable edits before execution, while each node maintains a concept memory bank to preserve semantic…
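The dual-view clustering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `DraftEdit` type, the linear blend weight `alpha`, the exponential decay scale `tau`, the greedy assignment rule, and the `threshold` value are all hypothetical choices made here only to show how a semantic score and a temporal-locality score might be combined when grouping candidate edits.

```python
import math
from dataclasses import dataclass


@dataclass
class DraftEdit:
    """Hypothetical candidate edit derived from one post."""
    label: str               # proposed action, e.g. "add:<concept>"
    embedding: list[float]   # semantic vector of the proposed concept
    window: int              # index of the time window the post falls in


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def dual_view_similarity(e1, e2, alpha=0.7, tau=2.0):
    """Blend semantic similarity with temporal locality.

    alpha and tau are assumed values, not taken from the paper.
    """
    semantic = cosine(e1.embedding, e2.embedding)
    # Temporal view: 1.0 for the same window, decaying with window distance.
    temporal = math.exp(-abs(e1.window - e2.window) / tau)
    return alpha * semantic + (1 - alpha) * temporal


def cluster_edits(edits, threshold=0.8):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []
    for edit in edits:
        for cluster in clusters:
            if dual_view_similarity(edit, cluster[0]) >= threshold:
                cluster.append(edit)
                break
        else:
            # No existing cluster matched, so this edit seeds a new one.
            clusters.append([edit])
    return clusters
```

Under these assumed settings, two semantically close edits from adjacent windows fall into one cluster, while an edit with the same embedding that appears many windows later starts its own cluster. This is one way the temporal view can keep a later resurgence of a topic separate from its earlier occurrence.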
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts
