HICL: Hashtag-Driven In-Context Learning for Social Media Natural Language Understanding
Hanzhuo Tan, Chunpu Xu, Jing Li, Yuqun Zhang, Zeyang Fang, Zeyu Chen, Baohua Lai

TL;DR
This paper introduces HICL, a hashtag-driven in-context learning framework that enhances social media NLU by leveraging hashtag-based pre-training and retrieval of topic-related posts, significantly improving performance on multiple tasks.
Contribution
The paper proposes a novel hashtag-driven pre-training method and retrieval-based context enrichment for social media NLU, demonstrating substantial improvements over previous models.
Findings
Retrieving top posts with hashtags improves NLU performance.
Trigger words effectively fuse source and retrieved contexts.
HICL outperforms previous state-of-the-art on seven downstream tasks.
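The retrieval and fusion steps named in the findings can be illustrated with a minimal sketch. It assumes posts have already been embedded as vectors (the toy vectors below stand in for encoder output), ranks them by cosine similarity, and joins the source post with the retrieved posts using a separator token. The function names, the `[TOPIC]` trigger string, and the example posts are all hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_top_k(query_vec, corpus, k=2):
    """Return the k post texts whose vectors are most similar to query_vec.

    corpus is a list of (post_text, vector) pairs.
    """
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def fuse(source, retrieved, trigger="[TOPIC]"):
    """Join the source post and retrieved posts, placing a trigger token
    (a hypothetical placeholder here) before each retrieved post."""
    parts = [source]
    for post in retrieved:
        parts.append(trigger)
        parts.append(post)
    return " ".join(parts)

# Toy corpus: the vectors are made-up stand-ins for encoder embeddings.
corpus = [
    ("Great goal in the final minute #football", [0.9, 0.1, 0.0]),
    ("New phone camera is stunning #tech", [0.0, 0.2, 0.9]),
    ("What a save by the keeper #football", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # assumed embedding of the short source post

top = retrieve_top_k(query_vec, corpus, k=2)
enriched = fuse("unreal match tonight", top)
print(enriched)
```

The enriched string is what a downstream NLU model would receive in place of the short source post alone; how the trigger words are actually chosen is not described in this summary.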
Abstract
Natural language understanding (NLU) is integral to various social media applications. However, existing NLU models rely heavily on context for semantic learning, resulting in compromised performance when faced with short and noisy social media content. To address this issue, we leverage in-context learning (ICL), wherein language models learn to make inferences by conditioning on a handful of demonstrations, to enrich the context, and we propose a novel hashtag-driven in-context learning (HICL) framework. Concretely, we pre-train a model, #Encoder, which employs #hashtags (user-annotated topic labels) to drive BERT-based pre-training through contrastive learning. Our objective here is to enable #Encoder to incorporate topic-related semantic information, which allows it to retrieve topic-related posts to enrich contexts and enhance social media NLU with noisy contexts. To…
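As a rough illustration of hashtag-driven contrastive learning, the sketch below computes an InfoNCE-style loss in which a post that shares a hashtag with the anchor is the positive and other posts are negatives. The abstract does not state the exact loss, so the InfoNCE form, the temperature value, and the toy vectors are all assumptions rather than the authors' formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed objective, not confirmed by the source).

    The loss is small when the anchor is much closer to the positive
    (a post sharing its hashtag) than to any of the negatives.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Toy embeddings: made-up stand-ins for encoder output.
anchor = [1.0, 0.0]                      # post tagged #football
negatives = [[0.0, 1.0], [-1.0, 0.2]]    # posts with other hashtags

# Positive already close to the anchor: low loss.
loss_aligned = info_nce(anchor, [0.9, 0.1], negatives)
# Positive far from the anchor: higher loss, so training would pull it closer.
loss_misaligned = info_nce(anchor, [0.1, 0.9], negatives)

print(loss_aligned, loss_misaligned)
```

Minimizing a loss of this shape pushes posts with the same hashtag toward each other in embedding space, which is the property that topic-related retrieval relies on.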
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies
