A Systematic Analysis of Declining Medical Safety Messaging in Generative AI Models
Sonali Sharma, Ahmed M. Alaa, Roxana Daneshjou

TL;DR
This study tracks the presence of medical safety disclaimers in generative AI model outputs from 2022 to 2025 and finds a sharp decline as models have become more capable.
Contribution
It provides the first large-scale, longitudinal analysis of disclaimer presence in medical AI outputs and emphasizes the need for safeguards as models evolve.
Findings
Disclaimers in LLM outputs dropped from 26.3% to 0.97% between 2022 and 2025.
Disclaimers in VLM outputs decreased from 19.6% to 1.05% between 2023 and 2025.
By 2025, most models displayed no disclaimers, raising concerns about clinical safety.
Abstract
Generative AI models, including large language models (LLMs) and vision-language models (VLMs), are increasingly used to interpret medical images and answer clinical questions. Their responses often include inaccuracies; therefore, safety measures like medical disclaimers are critical to remind users that AI outputs are not professionally vetted or a substitute for medical advice. This study evaluated the presence of disclaimers in LLM and VLM outputs across model generations from 2022 to 2025. Using 500 mammograms, 500 chest X-rays, 500 dermatology images, and 500 medical questions, outputs were screened for disclaimer phrases. Medical disclaimer presence in LLM and VLM outputs dropped from 26.3% in 2022 to 0.97% in 2025, and from 19.6% in 2023 to 1.05% in 2025, respectively. By 2025, the majority of models displayed no disclaimers. As public models become more capable and…
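The abstract says outputs were screened for disclaimer phrases but does not list the phrases or describe the matching procedure. The following is a minimal sketch of how such screening might look, assuming simple case-insensitive pattern matching. The pattern list and the function names `has_disclaimer` and `disclaimer_rate` are hypothetical and are not taken from the study.

```python
import re

# Hypothetical phrase patterns for illustration only; the study's actual
# disclaimer phrase list is not given in the abstract.
DISCLAIMER_PATTERNS = [
    r"\bI am not a (?:doctor|medical professional)\b",
    r"\bI'm not a (?:doctor|medical professional)\b",
    r"\bnot a substitute for (?:professional )?medical advice\b",
    r"\bconsult (?:a|your) (?:doctor|physician|healthcare (?:provider|professional))\b",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in DISCLAIMER_PATTERNS]


def has_disclaimer(text: str) -> bool:
    """Return True if any disclaimer pattern occurs in the model output."""
    return any(p.search(text) for p in _COMPILED)


def disclaimer_rate(outputs: list[str]) -> float:
    """Percentage of outputs containing at least one disclaimer pattern."""
    if not outputs:
        return 0.0
    hits = sum(has_disclaimer(o) for o in outputs)
    return 100.0 * hits / len(outputs)
```

Computing `disclaimer_rate` separately for each model and release year would give per-generation percentages comparable in form to those reported above, though the values would depend entirely on the phrase list chosen.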
