An Analysis of Twitter Users From The Perspective of Their Behavior, Language, Region and Development Indices -- A Study of 80 Million Tweets
Shahab Saquib Sohail, Mohammad Muzammil Khan, M. Afshar Alam

TL;DR
This study analyzes over 86 million tweets to explore user behavior, language, regional activity, and their relation to human development indices, providing insights for diverse research domains.
Contribution
It presents a comprehensive methodology for data crawling and feature extraction from Twitter, along with an analysis linking tweet activity to country development levels.
Findings
High tweet activity in countries with high human development indices
Identification of dominant languages and frequent words in tweets
Correlation between regional tweet volume and socio-economic factors
Abstract
Many researchers have called for a comprehensive study of the various aspects of online social media. This paper gives an insight into the social platform Twitter. In this work, we illustrate a stepwise procedure for crawling the data and discuss the key issues in extracting associated features, useful in Twitter-related research, while crawling these data from Application Programming Interfaces (APIs). Further, the data, comprising over 86 million tweets, have been analysed from various perspectives, including the most used languages, most frequent words, most frequent users, and countries with the most and least tweets and re-tweets. The analysis reveals that Twitter user data is highly relevant to research in various domains, including politics, social science, economics, and linguistics. In…
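The kinds of counts the abstract mentions (most used languages, most frequent words, most frequent users) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the toy records, the field names `user`, `lang`, and `text`, and the helper `top_counts` are all assumptions made for illustration, and the tokenisation is a simple word-character split.

```python
from collections import Counter
import re

# Hypothetical toy records. The field names "user", "lang", and "text"
# are assumptions for illustration, not the schema used in the paper.
tweets = [
    {"user": "alice", "lang": "en", "text": "Good morning world"},
    {"user": "bob", "lang": "es", "text": "Hola mundo"},
    {"user": "alice", "lang": "en", "text": "Hello world again"},
]


def top_counts(records, n=3):
    """Return the n most common languages, words, and users."""
    langs = Counter(r["lang"] for r in records)
    users = Counter(r["user"] for r in records)
    # Lowercase each tweet and split it on runs of word characters.
    words = Counter(
        w for r in records for w in re.findall(r"\w+", r["text"].lower())
    )
    return langs.most_common(n), words.most_common(n), users.most_common(n)


langs, words, users = top_counts(tweets)
print(langs[0], words[0], users[0])
```

At the scale of tens of millions of tweets, the same counters could be updated incrementally while streaming records from disk rather than holding them all in memory.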
Taxonomy
Topics: Complex Network Analysis Techniques · Sentiment Analysis and Opinion Mining · Spam and Phishing Detection
