Dream Content Discovery from Reddit with an Unsupervised Mixed-Method Approach
Anubhab Das, Sanja Šćepanović, Luca Maria Aiello, Remington Mallett, Deirdre Barrett, and Daniele Quercia

TL;DR
This paper introduces a novel unsupervised mixed-method NLP approach for analyzing large-scale dream reports from Reddit. It identifies an extensive set of themes and patterns and tracks how they change over time and around major events.
Contribution
It presents the largest collection of dream topics to date and demonstrates a data-driven method that surpasses traditional scales in analyzing dream content.
Findings
Identified 217 topics grouped into 22 themes from 44,213 reports
Validated topics against the Hall and van de Castle scale
Detected changes in dream themes during major events like COVID-19 and war
Abstract
Dreaming is a fundamental but not fully understood part of human experience that can shed light on our thought patterns. Traditional dream analysis practices, while popular and aided by over 130 unique scales and rating systems, have limitations. Mostly based on retrospective surveys or lab studies, they struggle to be applied on a large scale or to show the importance and connections between different dream themes. To overcome these issues, we developed a new, data-driven mixed-method approach for identifying topics in free-form dream reports through natural language processing. We tested this method on 44,213 dream reports from Reddit's r/Dreams subreddit, where we found 217 topics, grouped into 22 larger themes: the most extensive collection of dream topics to date. We validated our topics by comparing them to the widely-used Hall and van de Castle scale. Going beyond traditional…
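To make the idea of unsupervised topic discovery in free-form reports concrete, here is a minimal sketch. It is not the authors' pipeline, which the abstract does not detail. As a stand-in technique it uses TF-IDF weighting with greedy cosine-similarity grouping. The toy reports, the stopword list, the threshold, and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Toy stand-ins for dream reports (hypothetical, not data from the paper).
REPORTS = [
    "I was flying over the city and flying above the clouds",
    "Flying high over mountains and clouds",
    "My teeth crumbled and my teeth fell out",
    "Lost teeth in the mirror, teeth everywhere",
]

# Tiny illustrative stopword list (an assumption, far from complete).
STOPWORDS = {"i", "was", "the", "and", "my", "in", "over", "out", "a"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many reports each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    return [
        {t: c * (math.log(n / df[t]) + 1.0) for t, c in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def group_reports(vectors, threshold=0.2):
    """Greedy grouping: join the first cluster whose seed report is similar enough."""
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no similar cluster found: start a new one
    return clusters


def top_terms(cluster, vectors, k=3):
    """Label a cluster with its k highest-weighted terms."""
    totals = Counter()
    for i in cluster:
        totals.update(vectors[i])
    return [term for term, _ in totals.most_common(k)]


vectors = tfidf_vectors(REPORTS)
clusters = group_reports(vectors)
for cluster in clusters:
    print(cluster, top_terms(cluster, vectors))
```

On this toy input the flying reports and the teeth reports end up in separate groups. At the scale described in the abstract, grouping of this kind would be followed by manual inspection and labeling of the topics, which is where a mixed-method approach adds its qualitative step.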
Taxonomy
Topics: Sleep and Wakefulness Research
