Letter to the Editor of the Journal of Medical Systems: Regarding “Responses of Five Different Artificial Intelligence Chatbots to the Top Searched Queries About Erectile Dysfunction: A Comparative Analysis”

Jakub Brzeziński; Robert Olszewski

PMC · DOI: 10.1007/s10916-024-02082-y · July 5, 2024

Open Access

Full text

A recently published article in the journal, which is both intriguing and highly futuristic, examines the quality and readability of responses provided by five different AI-based chatbots to queries about erectile dysfunction. In the ‘Materials and Methods’ section, the authors detail their analysis of these responses using several quality and readability instruments, including DISCERN, Ensuring Quality Information for Patients (EQIP), the Flesch-Kincaid Grade Level (FKGL), and the Flesch-Kincaid Reading Ease (FKRE). Additionally, they disclose the names of the five AI-based chatbots used to generate the responses under analysis [1].

Upon reviewing the article, we noticed an error in the list of chatbots named by the authors. In the first sentence of the abstract and the second paragraph of the ‘Materials and Methods’ section, the authors introduce five chatbots: ChatGPT, Bard, Bing, Ernie, and Copilot. However, Microsoft Bing and Copilot, which the authors treat as separate chatbots throughout their article, are the same chatbot. The authors also included links to the chatbots’ websites in the second paragraph. Upon visiting the links provided for Microsoft Bing (https://www.bing.com/chat) and Copilot (https://copilot.microsoft.com), we were directed to the same Copilot chatbot page. This indicates a potential oversight on the part of the authors.

During Microsoft Ignite 2023, held from November 15 to 17, 2023, it was announced that the Bing chatbot would be rebranded as Copilot, effective December 1, 2023. Copilot enhances the existing Bing service environment to deliver a novel search experience. Like Bing, Copilot is powered by a cutting-edge large language model (LLM), GPT-4 (Generative Pre-trained Transformer 4) [2].

Contrary to the article’s statement, the authors used four different chatbots, not five. Interestingly, one of these chatbots underwent a name change during the analysis. The authors did not provide the specific dates on which the chatbots were queried and analyzed, so the rationale for counting one chatbot as two remains unclear. Recent research articles on chatbots powered by large language models (LLMs) have noted that Bing has been rebranded as Copilot [3].

We are contacting you not to undermine the authors’ work but to rectify an inaccuracy. This article might misinform readers interested in utilizing chatbots powered by large language models (LLMs). The authors state that they used five distinct chatbots when, in reality, they used four, as two are identical.

Bibliography

1. https://www.microsoft.com/en-us/bing?ep=278&form=MA13LT&es=31 (Access: 11.04.2024)