Expert-sourcing Domain-specific Knowledge: The Case of Synonym Validation
Michael Unterkalmsteiner, Andrew Yates

TL;DR
This paper presents a tool-supported approach for sourcing domain-specific knowledge from experts to validate synonyms, aiming to improve data labeling quality in collaborative software engineering scenarios.
Contribution
It introduces a novel expert-sourcing method with design considerations to motivate expert participation in synonym validation tasks.
Findings
The approach is currently being used to identify true synonyms from lists of candidate synonyms.
Design decisions in the tool aim to motivate experts to dedicate time to the labelling task.
The authors foresee that expert-sourcing is applicable to other data labelling tasks in software engineering.
Abstract
One prerequisite for supervised machine learning is high-quality labelled data. Acquiring such data is costly, particularly if expert knowledge is required, or even impossible if the task needs to be performed by a single expert. In this paper, we illustrate tool support that we adopted and extended to source domain-specific knowledge from experts. We provide insight into design decisions that aim at motivating experts to dedicate their time to performing the labelling task. We are currently using the approach to identify true synonyms from a list of candidate synonyms. The identification of synonyms is important in scenarios where stakeholders from different companies and backgrounds need to collaborate, for example when defining and negotiating requirements. We foresee that the approach of expert-sourcing is applicable to any data labelling task in software engineering. The discussed…
Taxonomy
Topics: Natural Language Processing Techniques · Biomedical Text Mining and Ontologies · Semantic Web and Ontologies
