Refinement of an Epilepsy Dictionary through Human Annotation of Health-related posts on Instagram
Aehong Min, Xuan Wang, Rion Brattig Correia, Jordan Rozum, Wendy R. Miller, Luis M. Rocha

TL;DR
This study refines an epilepsy-related biomedical dictionary by human and AI annotation of Instagram posts, improving the relevance of terms and the quality of knowledge networks for epilepsy research.
Contribution
The paper introduces a refined epilepsy dictionary based on human validation and AI comparison, enhancing term relevance and network analysis accuracy.
Findings
Refined dictionary reduces false positives in term matching.
Important terms in the network are more medically relevant after refinement.
GPT models underperform compared to human annotators in annotation tasks.
Abstract
We used a dictionary built from biomedical terminology extracted from various sources, such as DrugBank, MedDRA, MedlinePlus, and TCMGeneDIT, to tag more than 8 million Instagram posts by users who mentioned an epilepsy-relevant drug at least once between 2010 and early 2016. A random sample of 1,771 posts with 2,947 term matches was evaluated by human annotators to identify false positives. OpenAI's GPT series models were compared against human annotation. Frequent terms with a high false-positive rate were removed from the dictionary. Analysis of the estimated false-positive rates of the annotated terms revealed 8 ambiguous terms (plus synonyms) used in Instagram posts, which were removed from the original dictionary. To study the effect of removing those terms, we constructed knowledge networks using the refined and the original dictionaries and performed an eigenvector-centrality…
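The two computational steps named in the abstract, estimating per-term false-positive rates from annotations and ranking terms in a knowledge network by eigenvector centrality, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the example terms, and the thresholds `min_matches` and `max_fp_rate` are assumptions made for illustration, since the abstract does not state the cutoffs or the network construction that were actually used.

```python
from collections import Counter


def false_positive_rates(annotations):
    """Return {term: (false_positive_rate, n_matches)}.

    `annotations` is an iterable of (term, is_false_positive) pairs,
    one pair per annotated term match (hypothetical input format).
    """
    matches = Counter()
    false_pos = Counter()
    for term, is_fp in annotations:
        matches[term] += 1
        if is_fp:
            false_pos[term] += 1
    return {t: (false_pos[t] / matches[t], matches[t]) for t in matches}


def terms_to_remove(annotations, min_matches=3, max_fp_rate=0.5):
    """Flag frequent terms whose estimated false-positive rate is high.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    rates = false_positive_rates(annotations)
    return sorted(
        t for t, (rate, n) in rates.items()
        if n >= min_matches and rate > max_fp_rate
    )


def eigenvector_centrality(weights, iterations=500, tol=1e-12):
    """Power iteration on a symmetric weighted graph.

    `weights` maps node -> {neighbor: weight}. Each step applies (A + I)
    rather than A, which leaves the leading eigenvector unchanged but
    avoids oscillation on bipartite graphs.
    """
    nodes = sorted(weights)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {
            n: score[n] + sum(w * score[m] for m, w in weights[n].items())
            for n in nodes
        }
        norm = sum(v * v for v in new.values()) ** 0.5
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score
```

Under these assumptions, the refinement step amounts to dropping the output of `terms_to_remove` from the dictionary, rebuilding the network without those terms, and comparing the centrality rankings of the two networks.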
Taxonomy
Topics: Advanced Text Analysis Techniques · Wikis in Education and Collaboration · Topic Modeling
Methods: Linear Layer · Discriminative Fine-Tuning · Multi-Head Attention · Layer Normalization · Dense Connections · Attention Dropout · Weight Decay · Cosine Annealing · Dropout
