Identifying Experts in Question & Answer Portals: A Case Study on Data Science Competencies in Reddit

Sofia Strukova; José A. Ruipérez-Valiente; Félix Gómez Mármol

arXiv:2204.04098·cs.CY·June 28, 2023

Open Access

TL;DR

This paper presents a semi-supervised method for identifying data science experts, non-experts, and out-of-scope comments on Reddit, utilizing NLP and user features, with a novel coding approach and user classification.

Contribution

It introduces a novel manual coding scheme including out-of-scope comments and a semi-supervised model combining labeled and unlabeled data for expert identification.

Findings

01

NLP and user features are most effective for classification

02

Model generalizes well within the domain

03

Different user types in Reddit are identified

Abstract

The key to the success of Question & Answer (Q&A) platforms is their users, who provide high-quality answers to the challenging questions posted across various topics of interest. For more than a decade, the expert finding problem has attracted much attention in information retrieval research. Based on the gaps encountered in expert identification across several Q&A portals, we inspect the feasibility of identifying data science experts in Reddit. Our method is based on manual coding results in which two data science experts labelled not only expert and non-expert comments, but also out-of-scope comments. This is a novel contribution to the literature, enabling the identification of more groups of comments across web portals. We present a semi-supervised approach which combines 1,113 labelled comments with 100,226 unlabelled comments during training. The proposed model…
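To make the idea of combining a small labelled set with a larger unlabelled pool concrete, the following is a minimal self-training sketch. It is not the authors' model: the bag-of-words features, the nearest-centroid classifier, the confidence threshold, the function names, and the toy comments are all assumptions chosen for illustration. Only the three classes (expert, non-expert, out-of-scope) come from the abstract.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term counts for one comment (illustrative feature choice)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def centroids(examples):
    """Sum the term counts of all comments that share a label."""
    out = {}
    for text, label in examples:
        out.setdefault(label, Counter()).update(vectorize(text))
    return out


def predict(cents, text):
    """Return (best_label, similarity) for one comment."""
    vec = vectorize(text)
    return max(
        ((label, cosine(vec, c)) for label, c in cents.items()),
        key=lambda pair: pair[1],
    )


def self_train(labelled, unlabelled, threshold=0.5, max_rounds=5):
    """Repeatedly pseudo-label confident unlabelled comments and refit.

    The threshold and round limit are hypothetical values, not taken
    from the paper.
    """
    labelled = list(labelled)
    pool = list(unlabelled)
    for _ in range(max_rounds):
        cents = centroids(labelled)
        confident, rest = [], []
        for text in pool:
            label, score = predict(cents, text)
            (confident if score >= threshold else rest).append((text, label))
        if not confident:
            break  # nothing new to learn from; stop early
        labelled.extend(confident)
        pool = [text for text, _ in rest]
    return centroids(labelled), pool


# Toy data, invented for this example only.
labelled = [
    ("use cross validation to tune the regularization parameter", "expert"),
    ("what is machine learning please help", "non-expert"),
    ("check out my new gaming laptop deal", "out-of-scope"),
]
unlabelled = [
    "cross validation helps tune the model parameter",
    "please help what is a neural network",
    "great laptop deal today",
]

model, leftover = self_train(labelled, unlabelled)
label, score = predict(model, "tune the regularization with cross validation")
print(label, round(score, 3))
print("still unlabelled:", leftover)
```

Comments whose best similarity stays below the threshold are left unlabelled rather than forced into a class, which is one simple way to keep low-confidence pseudo-labels from contaminating the training set.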

Taxonomy

Topics: Expert finding and Q&A systems · Topic Modeling