Using Crowdsourcing to Identify a Proxy of Socio-Economic status
Adil E. Rajput, Akila Sarirete, Tamer F. Desouky

TL;DR
This paper explores using social media community language patterns, derived from discussion forums, as proxies for socio-economic status to aid health and urban service planning.
Contribution
It introduces a method to group users by geographic background through community vocabulary analysis, providing a new approach to estimate socio-economic status from online discussions.
Findings
Distinct vocabulary patterns correlate with different communities.
Language differences can predict socio-economic status.
Framework supports better targeting for health and urban services.
Abstract
Social media provides researchers with an unprecedented opportunity to gain insight into various facets of human life. Health practitioners put great emphasis on pinpointing the socioeconomic status (SES) of individuals, as they can use it to predict certain diseases. Crowdsourcing is the term coined for gathering intelligence from an online user community. To group online users into communities, researchers have made use of hashtags that capture the interests of a community of users. In this paper, we propose a mechanism to group a certain set of users based on their geographic background and build a corpus for such users. Specifically, we have looked at discussion forums for some vehicles where the site has established communities for different areas to air their grievances or sing the praises of the vehicle. From such a discussion, it was possible to glean the…
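The idea of building a corpus per geographic community and comparing vocabularies can be illustrated with a minimal Python sketch. It is not the authors' implementation: the community names, the sample posts, and the helper functions `tokenize`, `build_vocabulary`, and `distinctive_terms` are all hypothetical, and the simple "terms used by no other community" rule is an assumed stand-in for whatever vocabulary analysis the paper actually performs.

```python
import re
from collections import Counter

# Hypothetical toy posts keyed by regional forum community (not data from the paper).
posts_by_community = {
    "region_a": [
        "The truck handles great on the highway",
        "Love the mileage on this truck",
    ],
    "region_b": [
        "The lorry handles well on the motorway",
        "Brilliant mileage on this lorry",
    ],
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(posts):
    """Count how often each token appears across one community's posts."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    return counts


def distinctive_terms(vocabs, community):
    """Return the terms (with counts) that only the given community uses."""
    other_terms = set()
    for name, vocab in vocabs.items():
        if name != community:
            other_terms.update(vocab)
    return {
        term: count
        for term, count in vocabs[community].items()
        if term not in other_terms
    }


# One vocabulary (a small corpus summary) per community.
vocabs = {
    name: build_vocabulary(posts)
    for name, posts in posts_by_community.items()
}

for name in vocabs:
    print(name, sorted(distinctive_terms(vocabs, name)))
```

On real forum data the per-community counts would normally be normalized by corpus size and filtered for stop words before any comparison, and linking the resulting vocabulary differences to SES would require external ground truth that this sketch does not model.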
