An Approach for Auto Generation of Labeling Functions for Software Engineering Chatbots
Ebube Alor, Ahmad Abdellatif, SayedHassan Khatoonabadi, Emad Shihab

TL;DR
This paper presents a method to automatically generate labeling functions for training software engineering chatbots, reducing manual effort and improving NLU performance on specialized datasets.
Contribution
The authors introduce an automated approach to generate labeling functions from labeled queries, enhancing data labeling efficiency for SE chatbots.
Findings
Generated LFs label data with an AUC of up to 85.3%.
NLU performance improves by up to 27.2% with generated LFs.
The number of generated LFs influences labeling effectiveness.
Abstract
Software engineering (SE) chatbots are increasingly gaining attention for their role in enhancing development processes. At the core of chatbots are Natural Language Understanding platforms (NLUs), which enable them to comprehend user queries but require labeled data for training. However, acquiring such labeled data for SE chatbots is challenging due to the scarcity of high-quality datasets, as training requires specialized vocabulary and phrases not found in typical language datasets. Consequently, developers often resort to manually annotating user queries -- a time-consuming and resource-intensive process. Previous approaches require human intervention to generate rules, called labeling functions (LFs), that categorize queries based on specific patterns. To address this issue, we propose an approach to automatically generate LFs by extracting patterns from labeled user queries. We…
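To make the idea of a labeling function concrete, here is a minimal sketch in Python. It is not the authors' pipeline, whose pattern-extraction details are not given in this excerpt. It assumes the simplest possible pattern, a single keyword that occurs under exactly one intent in a small labeled seed set, and it combines LF votes by plain majority. The intent names, the helper names (`make_keyword_lf`, `generate_lfs`, `weak_label`), and the `top_k` parameter are all hypothetical.

```python
from collections import Counter, defaultdict

ABSTAIN = None  # returned when a labeling function has no opinion


def make_keyword_lf(keyword, label):
    """Build an LF that votes `label` when `keyword` is a token of the query."""
    def lf(query):
        return label if keyword in query.lower().split() else ABSTAIN
    return lf


def generate_lfs(labeled_queries, top_k=2):
    """Derive keyword LFs from (query, label) pairs.

    Assumption for this sketch: a token is a usable pattern for a label
    only if it never occurs under any other label. Candidates are ranked
    by how many queries of that label contain them (ties alphabetical).
    """
    token_counts = defaultdict(Counter)
    for query, label in labeled_queries:
        # Count each token once per query.
        token_counts[label].update(set(query.lower().split()))

    lfs = []
    for label, counts in token_counts.items():
        # Tokens seen under any other label are not distinctive.
        others = set()
        for other_label, other_counts in token_counts.items():
            if other_label != label:
                others.update(other_counts)
        exclusive = sorted(
            (tok for tok in counts if tok not in others),
            key=lambda tok: (-counts[tok], tok),
        )
        for tok in exclusive[:top_k]:
            lfs.append(make_keyword_lf(tok, label))
    return lfs


def weak_label(query, lfs):
    """Label a query by majority vote over the LFs that do not abstain."""
    votes = [v for v in (lf(query) for lf in lfs) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]


# Hypothetical seed data: a few labeled SE chatbot queries.
seed = [
    ("who fixed the most bugs last month", "TopContributor"),
    ("which developer fixed the most issues", "TopContributor"),
    ("how many commits were pushed last week", "CountCommits"),
    ("how many commits happened in june", "CountCommits"),
]
lfs = generate_lfs(seed)

print(weak_label("who fixed this crash", lfs))
print(weak_label("how many commits today", lfs))
print(weak_label("hello there", lfs))
```

The sketch omits stop-word filtering, so generic tokens such as "how" can be selected as patterns. A real system would need a stronger notion of pattern and a more careful way of reconciling conflicting votes than a simple majority.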
Taxonomy
Topics: AI in Service Interactions · Topic Modeling
Methods: Softmax · Attention Is All You Need · Focus
