Social Media Analysis based on Semanticity of Streaming and Batch Data
Barathi Ganesh HB

TL;DR
This paper presents a method for extracting semantic information from streaming and batch social media data to improve Named Entity Recognition and author profiling, focusing on sociolect aspects like gender and age.
Contribution
It introduces a novel approach combining Conditional Random Fields with context analysis for sociolect profiling from social media micro posts.
Findings
Effective entity recognition using CRF in micro posts
Novel sociolect profiling method based on micro post context
Improved understanding of language variation across demographics
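The two kinds of context named in this summary (the context inside a single micro post for entity recognition, and the context pooled across many micro posts for author profiling) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the feature set, and whitespace tokenization are all assumptions made here. The per-token dictionaries are in the general shape that CRF toolkits commonly take as input; no CRF is trained in this sketch.

```python
from collections import Counter


def token_features(tokens, i):
    """Features for token i, drawn only from its own micro post.

    The feature set is hypothetical and chosen for illustration.
    """
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_title": tok.istitle(),
        "is_hashtag": tok.startswith("#"),
        "is_mention": tok.startswith("@"),
        "suffix3": tok[-3:].lower(),
        # Neighbouring tokens supply the within-post context.
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def post_features(post):
    """Turn one micro post into a sequence of per-token feature dicts."""
    tokens = post.split()  # assumption: simple whitespace tokenization
    return [token_features(tokens, i) for i in range(len(tokens))]


def author_profile(posts):
    """Pool all of an author's micro posts into relative word frequencies."""
    counts = Counter(tok.lower() for post in posts for tok in post.split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


if __name__ == "__main__":
    print(post_features("Meet @sam in Paris"))
    print(author_profile(["lol so good", "so so tired"]))
```

The first function looks at one post at a time, which suits streaming data; the last one needs the whole batch of an author's posts before it can produce a profile.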
Abstract
The language people share differs from region to region in accent, pronunciation, and word usage. Today this sharing takes place mainly through social media and blogs. Large numbers of such micro posts appear every second, which creates the need to process them in order to extract knowledge from them. How knowledge is extracted depends on the application, and research in cognitive science has supplied the requirements for it. This work advances that research by extracting semantic information from streaming and batch data for applications such as Named Entity Recognition and Author Profiling. For Named Entity Recognition, the context of a single micro post is used; for Author Profiling, the context that lies in the pool of micro posts is used to identify the sociolect aspects of the author of those micro posts. In this work…
Taxonomy
Topics: Topic Modeling · Spam and Phishing Detection · Authorship Attribution and Profiling
