How Reliable AI Chatbots are for Disease Prediction from Patient Complaints?
Ayesha Siddika Nipu, K M Sajjadul Islam, Praveen Madiraju

TL;DR
This study evaluates the reliability of AI chatbots like GPT 4.0, Claude 3 Opus, and Gemini Ultra 1.0 in predicting diseases from patient complaints, highlighting their potential and current limitations in healthcare decision-making.
Contribution
It provides a comparative analysis of multiple AI chatbots' performance in disease prediction, emphasizing the need for validation and human oversight in medical applications.
Findings
GPT 4.0 achieves high accuracy with more data
Gemini Ultra 1.0 performs well with fewer examples
Fine-tuned BERT performs worse than all three chatbots
Abstract
Artificial Intelligence (AI) chatbots leveraging Large Language Models (LLMs) are gaining traction in healthcare for their potential to automate patient interactions and aid clinical decision-making. This study examines the reliability of AI chatbots, specifically GPT 4.0, Claude 3 Opus, and Gemini Ultra 1.0, in predicting diseases from patient complaints in the emergency department. The methodology includes few-shot learning techniques to evaluate the chatbots' effectiveness in disease prediction. We also fine-tune the transformer-based model BERT and compare its performance with the AI chatbots. Results suggest that GPT 4.0 achieves high accuracy with increased few-shot data, while Gemini Ultra 1.0 performs well with fewer examples, and Claude 3 Opus maintains consistent performance. BERT's performance, however, is lower than all the chatbots, indicating limitations due to limited…
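As a rough illustration of the few-shot setup the abstract describes, the sketch below assembles a k-shot prompt from labelled complaints and scores predictions by exact-match accuracy. It is a minimal sketch under stated assumptions, not the authors' code: the function names, the prompt wording, and the example complaints are all hypothetical, and the abstract does not state the actual prompt template, dataset, or metric.

```python
from typing import List, Sequence, Tuple


def build_few_shot_prompt(examples: Sequence[Tuple[str, str]], complaint: str) -> str:
    """Build a k-shot prompt: k labelled complaints, then the unlabelled query.

    The instruction wording is an assumption, not the study's template.
    """
    blocks: List[str] = ["Predict the most likely disease for each patient complaint."]
    for text, label in examples:
        blocks.append(f"Complaint: {text}\nDisease: {label}")
    # The final block leaves the label empty for the chatbot to complete.
    blocks.append(f"Complaint: {complaint}\nDisease:")
    return "\n\n".join(blocks)


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions matching the gold label (case-insensitive, trimmed)."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(p.strip().lower() == g.strip().lower() for p, g in zip(predictions, gold))
    return hits / len(gold)


# Invented examples for illustration only; not taken from the study's data.
shots = [
    ("wheezing and shortness of breath at night", "Asthma"),
    ("burning pain when urinating, frequent urge", "Urinary tract infection"),
]
prompt = build_few_shot_prompt(shots, "crushing chest pain radiating to left arm")
print(prompt)
```

Varying the length of `shots` corresponds to varying the amount of few-shot data, which is the axis along which the abstract compares the three chatbots.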
Taxonomy
Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education
Methods: Attention Is All You Need · WordPiece · Linear Warmup With Linear Decay · Cosine Annealing · Attention Dropout · Linear Layer · Multi-Head Attention · Residual Connection · Weight Decay
