A Comparison of Natural Language Understanding Platforms for Chatbots in Software Engineering
Ahmad Abdellatif, Khaled Badran, Diego Elias Costa, and Emad Shihab

TL;DR
This study compares four popular NLU platforms (IBM Watson, Google Dialogflow, Rasa, and Microsoft LUIS) for software engineering chatbots, evaluating intent classification, confidence score stability, and entity extraction on software engineering datasets.
Contribution
The paper evaluates four widely used NLU platforms specifically for software engineering chatbots and offers practical guidance on choosing among them, based on their measured performance in intent classification, confidence scores, and entity extraction.
Findings
IBM Watson performs best overall when intent classification, confidence scores, and entity extraction are considered together.
Rasa provides the highest confidence scores.
Microsoft LUIS and IBM Watson lead in entity extraction.
Abstract
Chatbots are envisioned to dramatically change the future of Software Engineering, allowing practitioners to chat and inquire about their software projects and interact with different services using natural language. At the heart of every chatbot is a Natural Language Understanding (NLU) component that enables the chatbot to understand natural language input. Recently, many NLU platforms have been made available as off-the-shelf NLU components for chatbots; however, selecting the best NLU for Software Engineering chatbots remains an open challenge. Therefore, in this paper, we evaluate four of the most commonly used NLUs, namely IBM Watson, Google Dialogflow, Rasa, and Microsoft LUIS, to shed light on which NLU should be used in Software Engineering-based chatbots. Specifically, we examine the NLUs' performance in classifying intents, confidence scores stability, and extracting…
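To make the evaluated aspects concrete, here is a minimal sketch of how intent classification and confidence scores could be measured for any one NLU. It is not the authors' evaluation code. The choice of per-intent F1 as the metric, the function names, and the intent labels are all assumptions made for illustration. The input is assumed to be parallel lists of gold intents, predicted intents, and the confidence the NLU reported for each prediction.

```python
from collections import defaultdict
from statistics import median


def per_intent_f1(gold, predicted):
    """Return {intent: F1} for parallel lists of gold and predicted labels."""
    tp = defaultdict(int)  # true positives per intent
    fp = defaultdict(int)  # false positives per intent
    fn = defaultdict(int)  # false negatives per intent
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted intent was wrongly assigned
            fn[g] += 1  # gold intent was missed
    scores = {}
    for intent in set(gold) | set(predicted):
        denom = 2 * tp[intent] + fp[intent] + fn[intent]
        scores[intent] = 2 * tp[intent] / denom if denom else 0.0
    return scores


def median_confidence_by_outcome(gold, predicted, confidences):
    """Return (median confidence of correct predictions,
    median confidence of incorrect predictions); None if a group is empty."""
    correct = [c for g, p, c in zip(gold, predicted, confidences) if g == p]
    wrong = [c for g, p, c in zip(gold, predicted, confidences) if g != p]
    return (
        median(correct) if correct else None,
        median(wrong) if wrong else None,
    )


if __name__ == "__main__":
    # Hypothetical intents and scores, for illustration only.
    gold = ["FileCommits", "FileCommits", "BuggyFile", "BuggyFile"]
    pred = ["FileCommits", "BuggyFile", "BuggyFile", "BuggyFile"]
    conf = [0.9, 0.6, 0.8, 0.7]
    print(per_intent_f1(gold, pred))
    print(median_confidence_by_outcome(gold, pred, conf))
```

Comparing the median confidence of correct and incorrect predictions is one simple way to judge whether a platform's confidence scores can be trusted: a useful score should be clearly higher when the predicted intent is right.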
