A Novel, Human-in-the-Loop Computational Grounded Theory Framework for Big Social Data
Lama Alqazlan, Zheng Fang, Michael Castelle, Rob Procter

TL;DR
This paper introduces a human-in-the-loop computational grounded theory framework designed for analyzing large social datasets, combining qualitative rigor with scalable machine learning and NLP tools.
Contribution
The paper presents a novel methodological framework that integrates human oversight with computational methods for large-scale qualitative data analysis.
Findings
The framework effectively analyzes large Reddit datasets.
It maintains qualitative rigor while gaining computational efficiency.
It gives researchers control over the ML/NLP analysis process.
Abstract
The availability of big data has significantly influenced the possibilities and methodological choices for conducting large-scale behavioural and social science research. In the context of qualitative data analysis, a major challenge is that conventional methods require intensive manual labour and are often impractical to apply to large datasets. One effective way to address this issue is by integrating emerging computational methods to overcome scalability limitations. However, a critical concern for researchers is the trustworthiness of results when Machine Learning (ML) and Natural Language Processing (NLP) tools are used to analyse such data. We argue that confidence in the credibility and robustness of results depends on adopting a 'human-in-the-loop' methodology that is able to provide researchers with control over the analytical process, while retaining the benefits of using ML…
Taxonomy
Topics: Computational and Text Analysis Methods · Qualitative Research Methods and Applications · Data Analysis with R
