The Tag is the Signal: URL-Agnostic Credibility Scoring for Messages on Telegram
Yipeng Wang, Huy Gia Han Vu, Mohit Singhal

TL;DR
This paper introduces TAG2CRED, a credibility scoring pipeline for short, URL-sparse messages on Telegram. A fine-tuned large language model assigns tags to each message, and the tags are mapped to a risk score; the approach outperforms a TF-IDF baseline.
Contribution
The paper presents a new tag-based credibility scoring method tailored for short Telegram messages, improving accuracy and generalization over existing URL-based and lexical feature approaches.
Findings
TAG2CRED achieves a ROC-AUC of 0.871, outperforming the TF-IDF baseline.
An ensemble model further improves the ROC-AUC to 0.901.
The model uses fewer features and generalizes better to infrequent domains.
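For readers unfamiliar with the metric reported above, ROC-AUC can be read as the probability that a randomly chosen high-risk message receives a higher score than a randomly chosen low-risk one. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = high-risk message, 0 = low-risk message.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.8]
auc = roc_auc(labels, scores)  # 5 of the 6 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of messages, so it suits small checks only; a rank-based implementation is the usual choice for full datasets.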
Abstract
Telegram has become one of the leading platforms for disseminating misinformation. However, many existing pipelines still classify each message's credibility based on the reputation of its associated domain names or its lexical features. Such methods work well on traditional long-form news articles published by well-known sources, but high-risk posts on Telegram are short and URL-sparse, leading to failures for link-based and standard TF-IDF models. To this end, we propose the TAG2CRED pipeline, a method designed for such short, convoluted messages. Our model directly scores each post based on the tags assigned to the text. We designed a concise label system that covers the dimensions of theme, claim type, call to action, and evidence. The fine-tuned large language model (LLM) assigns tags to messages and then maps these tags to calibrated risk scores in the [0,1]…
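The abstract is cut off before it says how tags become scores, so the following is only a minimal sketch of one plausible shape for that step: a logistic function over per-tag weights. The four dimension names come from the abstract. The tag values, the weights, the bias, and the function name `risk_score` are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, Tuple

# Hypothetical (dimension, tag) weights, for illustration only.
# In practice such weights would be fitted on labeled messages.
WEIGHTS: Dict[Tuple[str, str], float] = {
    ("theme", "health"): 0.4,
    ("claim_type", "unverified_assertion"): 1.2,
    ("call_to_action", "share_urgently"): 0.9,
    ("evidence", "none"): 0.8,
    ("evidence", "cites_source"): -1.1,
}
BIAS = -1.5  # hypothetical intercept


def risk_score(tags: Dict[str, str]) -> float:
    """Map one tag per dimension to a score strictly between 0 and 1."""
    # Unknown (dimension, tag) pairs contribute nothing to the logit.
    z = BIAS + sum(WEIGHTS.get((dim, tag), 0.0) for dim, tag in tags.items())
    return 1.0 / (1.0 + math.exp(-z))


# Two hypothetical tagged messages.
risky = {
    "theme": "health",
    "claim_type": "unverified_assertion",
    "call_to_action": "share_urgently",
    "evidence": "none",
}
sourced = {"theme": "health", "evidence": "cites_source"}

risky_score = risk_score(risky)
sourced_score = risk_score(sourced)
```

A logistic output always lies in the unit interval, but it is calibrated only if the weights are fitted so that predicted scores match observed frequencies on held-out data. The sketch does not attempt that step.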
Taxonomy
Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Hate Speech and Cyberbullying Detection
