An experiment exploring the theoretical and methodological challenges in developing a semi-automated approach to analysis of small-N qualitative data
Sandro Tsang

TL;DR
This study develops a semi-automated qualitative data analysis method that combines text-mining and coding to analyse a small-N set of transcripts efficiently, retrieving more specific information than text-mining alone.
Contribution
It introduces a semi-automated qualitative data analysis (QDA) algorithm that integrates text-mining (TM) with manual coding, improving efficiency and specificity in small-sample qualitative analysis.
Findings
QDA retrieved more specific information than TM alone
The analyses of the 20 transcripts were completed in 6-7 days
The method enhances transparency and systematic analysis
Abstract
This paper experiments with designing a semi-automated qualitative data analysis (QDA) algorithm to analyse 20 transcripts by using freeware. Text-mining (TM) and QDA were guided by frequency and association measures, because these statistics remain robust when the sample size is small. The refined TM algorithm split the text into various sizes based on a manually revised dictionary. This lemmatisation approach may reflect the context of the text better than uniformly tokenising the text into one single size. TM results were used for initial coding. Code repacking was guided by association measures and external data to implement a general inductive QDA approach. The information retrieved by TM and QDA was depicted in subgraphs for comparisons. The analyses were completed in 6-7 days. Both algorithms retrieved contextually consistent and relevant information. However, the QDA algorithm…
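The dictionary-guided tokenisation and the frequency and association statistics described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the dictionary entries, the function names, and the choice of a transcript-level Jaccard coefficient as the association measure are all assumptions made for illustration, since the abstract does not name the specific measures or tools used.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical multi-word dictionary; the paper's manually revised
# dictionary is not reproduced here.
DICTIONARY = {("health", "care"), ("data", "analysis")}
MAX_LEN = max(len(entry) for entry in DICTIONARY)


def tokenise(text, dictionary=DICTIONARY, max_len=MAX_LEN):
    """Greedy longest-match split: dictionary phrases stay whole,
    every other word becomes a single-word token."""
    words = re.findall(r"[a-z']+", text.lower())
    tokens, i = [], 0
    while i < len(words):
        for n in range(max_len, 0, -1):
            chunk = tuple(words[i:i + n])
            if n == 1 or chunk in dictionary:
                tokens.append(" ".join(chunk))
                i += len(chunk)
                break
    return tokens


def term_frequencies(transcripts):
    """Count how often each token occurs across all transcripts."""
    counts = Counter()
    for text in transcripts:
        counts.update(tokenise(text))
    return counts


def jaccard_associations(transcripts):
    """Jaccard coefficient per token pair, computed over the sets of
    transcripts in which each token appears (an assumed measure)."""
    doc_sets = [set(tokenise(text)) for text in transcripts]
    doc_freq = Counter(tok for s in doc_sets for tok in s)
    pair_freq = Counter()
    for s in doc_sets:
        pair_freq.update(combinations(sorted(s), 2))
    return {
        pair: n / (doc_freq[pair[0]] + doc_freq[pair[1]] - n)
        for pair, n in pair_freq.items()
    }
```

Under these assumptions, tokens with high frequency could seed the initial codes, and pairs with a high coefficient could suggest which codes to merge during repacking. Keys in the association dictionary are alphabetically sorted token pairs.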
Taxonomy
Topics: Computational and Text Analysis Methods · Qualitative Research Methods and Applications · Data Analysis with R
