A Reproducible Framework for Neural Topic Modeling in Focus Group Analysis
Heger Arfaoui, Mohammed Iheb Hergli, Beya Benzina, Slimane BenMiled

TL;DR
This paper introduces a reproducible framework for applying neural topic modeling to focus group data, demonstrating improved coherence and interpretability through systematic hyperparameter tuning and validation.
Contribution
It presents a systematic, reproducible approach for neural topic modeling in qualitative research, including hyperparameter exploration and multi-criteria evaluation for focus group analysis.
Findings
Transformer-based models outperform LDA in coherence for small datasets
Stability metrics and coherence can diverge, requiring multi-criteria evaluation
Reproducible framework and code are provided for future research
Abstract
Focus group discussions generate rich qualitative data, but their analysis traditionally relies on labor-intensive manual coding that limits scalability and reproducibility. We present a systematic framework for applying BERTopic to focus group transcripts using data from ten focus groups exploring HPV vaccine perceptions in Tunisia (1,075 utterances). We conducted comprehensive hyperparameter exploration across 27 configurations, evaluating each through bootstrap stability analysis, performance metrics, and comparison with an LDA baseline. Bootstrap analysis revealed that the two stability metrics, normalized mutual information (NMI) and the adjusted Rand index (ARI), disagreed strongly (r = -0.691) and showed divergent relationships with coherence, demonstrating that stability is multifaceted rather than monolithic. Our multi-criteria selection framework yielded a 7-topic model achieving 18% higher coherence than optimized LDA (0.573 vs.…
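To make the two stability metrics named in the abstract concrete, the following is a minimal sketch of how NMI and ARI can be computed between two topic assignments of the same utterances. It is not the authors' code: the function names (`adjusted_rand_index`, `normalized_mutual_info`, `mean_stability`) are hypothetical, the NMI normalization by the arithmetic mean of the two entropies is an assumption, and the abstract does not say how bootstrap runs were aligned, so the sketch simply assumes each run assigns a topic label to the same ordered list of utterances.

```python
from collections import Counter
from math import comb, log


def _contingency(labels_a, labels_b):
    """Joint and marginal label counts for two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same utterances")
    return Counter(zip(labels_a, labels_b)), Counter(labels_a), Counter(labels_b)


def adjusted_rand_index(labels_a, labels_b):
    """Pair-counting agreement, corrected for chance (1.0 = identical partitions)."""
    joint, rows, cols = _contingency(labels_a, labels_b)
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0
    index = sum(comb(c, 2) for c in joint.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / total_pairs
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:  # both partitions trivial (one cluster or all singletons)
        return 1.0
    return (index - expected) / (maximum - expected)


def _entropy(counts, n):
    return -sum((c / n) * log(c / n) for c in counts.values())


def normalized_mutual_info(labels_a, labels_b):
    """Mutual information divided by the mean of the two label entropies."""
    joint, rows, cols = _contingency(labels_a, labels_b)
    n = len(labels_a)
    h_a, h_b = _entropy(rows, n), _entropy(cols, n)
    if h_a == 0.0 and h_b == 0.0:  # both labelings use a single topic
        return 1.0
    mi = sum(
        (c / n) * log(n * c / (rows[a] * cols[b]))
        for (a, b), c in joint.items()
    )
    return mi / ((h_a + h_b) / 2)


def mean_stability(reference, runs, metric):
    """Average agreement between a reference labeling and each bootstrap run."""
    return sum(metric(reference, run) for run in runs) / len(runs)
```

Both metrics ignore label names, so a run that swaps topic IDs but keeps the same grouping still scores 1.0. They weigh disagreements differently (ARI counts utterance pairs, NMI measures shared information), which is one reason the two can rank configurations differently.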
Taxonomy
Topics: Focus Groups and Qualitative Methods · Computational and Text Analysis Methods · Mental Health via Writing
