Using natural language processing to extract health-related causality from Twitter messages

Son Doan; Elly W Yang; Sameer Tilak; Manabu Torii

arXiv:1911.06488·cs.CL·November 18, 2019

TL;DR

This paper presents an NLP-based method to extract health-related causal relations from Twitter messages, focusing on stress, insomnia, and headache, using dependency parsing and pattern matching on a large dataset.

Contribution

It proposes a set of lexico-syntactic patterns, built on dependency parser outputs, for extracting causal relations from tweets, and evaluates them on a dataset of 24 million tweets.

Findings

01

Achieved average precision between 74.59% and 92.27%.

02

Revealed insights into health concerns expressed on Twitter.

03

Demonstrated effectiveness of dependency parser-based patterns.

Abstract

Twitter messages (tweets) contain various types of information, including health-related information. Analysis of health-related tweets would help us understand health conditions and concerns encountered in our daily lives. In this work, we evaluated an approach to extracting causal relations from tweets using natural language processing (NLP) techniques. We focused on three health-related topics: "stress", "insomnia", and "headache". We proposed a set of lexico-syntactic patterns based on dependency parser outputs to extract causal information. A large dataset consisting of 24 million tweets was used. The results show that our approach achieved an average precision between 74.59% and 92.27%. Analysis of extracted relations revealed interesting findings about health-related information in Twitter.
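
To make the idea of a lexico-syntactic pattern over dependency parser output more concrete, here is a minimal sketch in Python. It is not the authors' pattern set or pipeline: the `Token` structure, the `CAUSAL_VERBS` list, the dependency labels, and the single subject-verb-object pattern are all assumptions chosen for illustration, and the example parse is written by hand rather than produced by a real parser.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Token:
    """One node of a dependency parse (hypothetical, simplified structure)."""
    idx: int    # position of the token in the sentence
    text: str   # surface form
    lemma: str  # lemmatized form
    head: int   # index of the head token; -1 marks the root
    dep: str    # dependency label linking this token to its head


# The three health topics named in the abstract.
HEALTH_TERMS = {"stress", "insomnia", "headache"}

# Assumed trigger verbs for illustration only; the paper's patterns are not reproduced here.
CAUSAL_VERBS = {"cause", "trigger"}


def find_child(tokens: List[Token], head_idx: int, dep: str) -> Optional[Token]:
    """Return the first token attached to head_idx with the given label, if any."""
    for tok in tokens:
        if tok.head == head_idx and tok.dep == dep:
            return tok
    return None


def extract_causal(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Match the pattern: nsubj(cause) <- causal verb -> dobj(health term)."""
    relations = []
    for verb in tokens:
        if verb.lemma not in CAUSAL_VERBS:
            continue
        subj = find_child(tokens, verb.idx, "nsubj")
        obj = find_child(tokens, verb.idx, "dobj")
        if subj is not None and obj is not None and obj.lemma in HEALTH_TERMS:
            relations.append((subj.lemma, obj.lemma))
    return relations


# Hand-written parse of the toy tweet "Exams cause headaches".
tweet = [
    Token(0, "Exams", "exam", 1, "nsubj"),
    Token(1, "cause", "cause", -1, "root"),
    Token(2, "headaches", "headache", 1, "dobj"),
]

# A tweet whose object is not one of the target health terms.
other = [
    Token(0, "Rain", "rain", 1, "nsubj"),
    Token(1, "causes", "cause", -1, "root"),
    Token(2, "delays", "delay", 1, "dobj"),
]

print(extract_causal(tweet))  # [('exam', 'headache')]
print(extract_causal(other))  # []
```

In practice the parse would come from a dependency parser rather than being written by hand, and precision would be estimated by manually checking a sample of the extracted pairs.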
