To Self-Treat or Not to Self-Treat: Evaluating the Diagnostic, Advisory and Referral Effectiveness of ChatGPT Responses to the Most Common Musculoskeletal Disorders
Ufuk Arzu, Batuhan Gencer

TL;DR
This study evaluates how reliable and understandable ChatGPT 4.0's responses are for common musculoskeletal issues, finding mostly adequate advice but highlighting risks in trauma-related cases.
Contribution
The study introduces a novel evaluation of ChatGPT 4.0's diagnostic and advisory effectiveness for musculoskeletal disorders using readability and expert scoring.
Findings
Most ChatGPT 4.0 responses were rated as 'adequate' or 'excellent' for musculoskeletal disorders.
Trauma-related questions showed significant variability in diagnostic accuracy based on area of interest.
ChatGPT 4.0's responses were readable at a high school level, but caution is needed for trauma advice.
Abstract
Background/Objectives: The increased accessibility of information has resulted in a rise in patients trying to self-diagnose and opting for self-medication, either as a primary treatment or as a supplement to medical care. Our objective was to evaluate the reliability, comprehensibility, and readability of the responses provided by ChatGPT 4.0 when queried about the most prevalent orthopaedic problems, thus ascertaining the occurrence of misguidance and the necessity for an audit of the disseminated information. Methods: ChatGPT 4.0 was presented with 26 open-ended questions. The responses were evaluated by two observers using a Likert scale in the categories of diagnosis, recommendation, and referral. The scores from the responses were subjected to subgroup analysis according to the area of interest (AoI) and anatomical region. The readability and comprehensibility of the chatbot’s…
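The scoring step described in the Methods (two observers, Likert ratings per category, subgroup analysis by AoI) can be sketched roughly as follows. This is a minimal illustration, not the authors' analysis: the 1-5 scale, the ratings, the question IDs, the "non-trauma" label, and the function name `subgroup_means` are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings, NOT data from the study:
# (question id, area of interest, category, observer 1 score, observer 2 score)
# A 1-5 Likert scale is assumed here; the abstract does not state the range.
ratings = [
    ("Q1", "trauma", "diagnosis", 3, 4),
    ("Q1", "trauma", "referral", 5, 5),
    ("Q2", "non-trauma", "diagnosis", 4, 5),
    ("Q2", "non-trauma", "referral", 4, 4),
]

def subgroup_means(rows):
    """Average the two observers per response, then average within each (AoI, category) subgroup."""
    groups = defaultdict(list)
    for _qid, aoi, category, obs1, obs2 in rows:
        groups[(aoi, category)].append(mean([obs1, obs2]))
    return {key: mean(values) for key, values in groups.items()}

result = subgroup_means(ratings)
for (aoi, category), score in sorted(result.items()):
    print(f"{aoi:>10} | {category:<10} | {score:.2f}")
```

Averaging the observers before grouping keeps each response weighted equally within its subgroup; a real analysis would also need an inter-observer agreement check and a significance test, neither of which is shown here.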
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Clinical Reasoning and Diagnostic Skills · Heart Failure Treatment and Management
