Artificial Intelligence (AI) in rheumatology: a comparative evaluation of the ChatGPT and DeepSeek application
Maria Polyzou, Xenofon Baraliakos

TL;DR
This paper compares the ChatGPT and DeepSeek AI models in diagnosing and treating two rheumatologic diseases, ankylosing spondylitis and psoriatic arthritis, and finds their reliability to be moderate.
Contribution
The study introduces a comparative evaluation of ChatGPT and DeepSeek in rheumatology using statistical tests and clinical data from 116 patients.
Findings
ChatGPT and DeepSeek showed moderate validity and reliability in providing information for axSpA and PsA.
Agreement statistics (Cohen’s Kappa and Fleiss’ Kappa) indicated no strong agreement between AI responses and clinical data.
The study recommends that AI-generated information should be validated by doctors before use.
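For readers unfamiliar with the agreement statistic named in the findings, the sketch below shows how Cohen's Kappa is computed for two raters: observed agreement is corrected for the agreement expected by chance from each rater's label frequencies. It is a minimal illustration only. The function name `cohens_kappa` and the four toy labels are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over all labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in labels) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # so this sketch treats it as perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy example (hypothetical labels, NOT study data):
# an AI answer versus a clinical record for four cases.
ai_labels = ["axSpA", "axSpA", "PsA", "PsA"]
clinical_labels = ["axSpA", "PsA", "PsA", "PsA"]
print(cohens_kappa(ai_labels, clinical_labels))  # 0.5
```

Here the raters agree on 3 of 4 cases (0.75) and chance agreement is 0.5, giving a kappa of 0.5. Fleiss' Kappa extends the same chance-correction idea to more than two raters.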
Abstract
The continuous increase in Artificial Intelligence (AI) applications in various areas of human life has brought about great changes in many sciences, including the health sector. ChatGPT and DeepSeek are Large Language Models (LLMs), AI systems developed using supervised and reinforcement learning techniques. The aim of this article is to evaluate the accuracy and consistency of the ChatGPT and DeepSeek models in the diagnosis and treatment of two rheumatologic diseases, ankylosing spondylitis (axSpA) and psoriatic arthritis (PsA). Both ChatGPT and the DeepSeek chat system have revolutionized information retrieval and are two of the fastest-growing platforms. They are effective tools that produce text responses to human input with high accuracy, accessibility, and low cost, but their use has raised many questions about their…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Spondyloarthritis Studies and Treatments · Clinical Reasoning and Diagnostic Skills
