Hashtag-Guided Low-Resource Tweet Classification
Shizhe Diao, Sedrick Scott Keh, Liangming Pan, Zhiliang Tian, Yan Song, Tong Zhang

TL;DR
This paper introduces HashTation, a model that automatically generates hashtags to enrich short, ambiguous tweets, significantly improving tweet classification in low-resource settings where labeled data is limited.
Contribution
Proposes a novel hashtag-guided classification model that generates meaningful hashtags to enhance tweet classification in low-resource settings.
Findings
HashTation improves accuracy on seven low-resource tweet classification tasks.
Automatically generated hashtags are consistent with tweet content and labels.
Enriching tweets with model-generated hashtags reduces the need for large labeled datasets.
Abstract
Social media classification tasks (e.g., tweet sentiment analysis, tweet stance detection) are challenging because social media posts are typically short, informal, and ambiguous. Thus, training on tweets is challenging and demands large-scale human-annotated labels, which are time-consuming and costly to obtain. In this paper, we find that providing hashtags to social media tweets can help alleviate this issue because hashtags can enrich short and ambiguous tweets in terms of various information, such as topic, sentiment, and stance. This motivates us to propose a novel Hashtag-guided Tweet Classification model (HashTation), which automatically generates meaningful hashtags for the input tweet to provide useful auxiliary signals for tweet classification. To generate high-quality and insightful hashtags, our hashtag generation model retrieves and encodes the post-level and entity-level…
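The enrichment idea described in the abstract can be illustrated with a minimal sketch: hashtags are appended to the tweet text before it is passed to a classifier. This is not the authors' implementation. The names `enrich_tweet` and `toy_hashtag_generator` are hypothetical, and the keyword lookup merely stands in for the learned hashtag generation model, whose details are not given in the text above.

```python
from typing import List


def enrich_tweet(tweet: str, hashtags: List[str]) -> str:
    """Append hashtags to a tweet, skipping tags it already contains.

    Hypothetical helper: comparison is case-insensitive and works on
    whitespace-separated tokens only.
    """
    existing = {tok.lower() for tok in tweet.split() if tok.startswith("#")}
    extra = []
    for tag in hashtags:
        norm = tag if tag.startswith("#") else "#" + tag
        if norm.lower() not in existing:
            existing.add(norm.lower())
            extra.append(norm)
    return " ".join([tweet] + extra)


def toy_hashtag_generator(tweet: str) -> List[str]:
    """Stand-in for a learned hashtag generator (assumption, not HashTation).

    Maps a few keywords to fixed hashtags purely for illustration.
    """
    keyword_to_tag = {"vaccine": "#health", "election": "#politics"}
    lowered = tweet.lower()
    return [tag for kw, tag in keyword_to_tag.items() if kw in lowered]


if __name__ == "__main__":
    tweet = "Got my vaccine today"
    # The enriched text would then be fed to any downstream tweet classifier.
    print(enrich_tweet(tweet, toy_hashtag_generator(tweet)))
```

In a real pipeline the toy generator would be replaced by a trained generation model, and the enriched string would serve as the classifier input in place of the raw tweet.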
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies · Topic Modeling
