Unsupervised Topic Discovery in User Comments
Christoph Stanik, Tim Pietz, Walid Maalej

TL;DR
This paper presents an unsupervised deep learning approach for automatically discovering semantically coherent topics in user comments, aiding stakeholders in extracting valuable insights without manual effort.
Contribution
It introduces a novel deep NLP-based method for unsupervised topic discovery in user comments that requires no parameter tuning and demonstrates high cluster cohesion and meaningfulness.
Findings
High inter-coder agreement (up to 98%) in evaluation
Effective thematic analysis on telecommunication tweets
Robustness of approach without parameter configuration
Abstract
On social media platforms like Twitter, users regularly share their opinions and comments with software vendors and service providers. Popular software products might get thousands of user comments per day. Research has shown that such comments contain valuable information for stakeholders, such as feature ideas, problem reports, or support inquiries. However, it is hard to manually manage and grasp a large number of user comments, which can be redundant and of varying quality. Consequently, researchers have suggested automated approaches to extract valuable comments, e.g., through problem report classifiers. However, these approaches do not aggregate semantically similar comments into specific aspects to provide insights such as how often users reported a certain problem. We introduce an approach for automatically discovering topics composed of semantically similar user comments based on…
