Using natural language processing to extract health-related causality from Twitter messages
Son Doan, Elly W Yang, Sameer Tilak, Manabu Torii

TL;DR
This paper presents an NLP-based method to extract health-related causal relations from Twitter messages, focusing on stress, insomnia, and headache, using dependency parsing and pattern matching on a large dataset.
Contribution
It introduces a set of lexico-syntactic patterns, built on dependency parser outputs, for extracting causal relations from tweets, achieving an average precision between 74.59% and 92.27% on a dataset of 24 million tweets.
Findings
Achieved average precision between 74.59% and 92.27%.
Revealed insights into health concerns expressed on Twitter.
Demonstrated effectiveness of dependency parser-based patterns.
Abstract
Twitter messages (tweets) contain various types of information, including health-related information. Analysis of health-related tweets would help us understand health conditions and concerns encountered in our daily lives. In this work, we evaluated an approach to extracting causal relations from tweets using natural language processing (NLP) techniques. We focused on three health-related topics: "stress", "insomnia", and "headache". We proposed a set of lexico-syntactic patterns based on dependency parser outputs to extract causal information. A large dataset consisting of 24 million tweets was used. The results show that our approach achieved an average precision between 74.59% and 92.27%. Analysis of extracted relations revealed interesting findings about health-related information on Twitter.
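To make the idea of a lexico-syntactic pattern over dependency parser output concrete, here is a minimal sketch in Python. It is not the authors' pattern set, which this summary does not list. The `Token` structure, the `CAUSAL_VERBS` list, the `nsubj`/`dobj` labels, and the single "X causes Y" pattern are all assumptions chosen for illustration, and the parse is written by hand rather than produced by a real parser.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    idx: int      # position of the token in the sentence
    text: str     # surface form
    lemma: str    # lowercased base form
    head: int     # index of the syntactic head (-1 for the root)
    deprel: str   # dependency label linking this token to its head


# Hypothetical trigger verbs; the paper's actual lexicon is not given here.
CAUSAL_VERBS = {"cause", "trigger"}
# The three health topics named in the abstract.
TARGETS = {"stress", "insomnia", "headache"}


def children(tokens: List[Token], head_idx: int, deprel: str) -> List[Token]:
    """Return the dependents of `head_idx` that carry the label `deprel`."""
    return [t for t in tokens if t.head == head_idx and t.deprel == deprel]


def extract_causal(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Match the illustrative pattern: nsubj <- causal verb -> dobj (a target)."""
    relations = []
    for verb in tokens:
        if verb.lemma not in CAUSAL_VERBS:
            continue
        for subj in children(tokens, verb.idx, "nsubj"):
            for obj in children(tokens, verb.idx, "dobj"):
                if obj.lemma in TARGETS:
                    relations.append((subj.lemma, obj.lemma))
    return relations


# Hand-written parse of "Coffee causes insomnia" (assumed parser output).
parsed = [
    Token(0, "Coffee", "coffee", 1, "nsubj"),
    Token(1, "causes", "cause", -1, "root"),
    Token(2, "insomnia", "insomnia", 1, "dobj"),
]

print(extract_causal(parsed))  # [('coffee', 'insomnia')]
```

In practice the token list would come from a dependency parser run over each tweet, and a realistic pattern set would need to cover more constructions (for example passives such as "insomnia caused by coffee") than this single active-voice pattern.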
