Assessing the efficacy of large language models in spinal health information dissemination

Aaron Lawson McLean; Christian Senft

PMC · DOI: 10.1016/j.bas.2024.102815 · April 12, 2024


Open Access

TL;DR

This paper explores how AI can help spread spinal health information and calls for strict regulations and diverse evaluations.

Contribution

Highlights the need for multidisciplinary assessments and regulations for AI in spinal health education.

Findings

1. AI has potential in spinal health education.
2. Multidisciplinary evaluations are crucial for AI effectiveness.
3. Strict regulations are needed for AI in healthcare.

Abstract

• Examines AI's role in spinal health education.
• Stresses multidisciplinary AI evaluations.
• Advocates for stringent AI healthcare regulations.

Keywords

Large language models; Healthcare AI; Patient education; Spinal health; Regulatory frameworks; Ethical implications

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Medical Imaging and Analysis · Radiomics and Machine Learning in Medical Imaging

Full text

The manuscript by Lang et al., titled "Are large language models valid tools for patient information on lumbar disc herniation? A spine surgeons' perspective" and published in the current issue of the journal, probes the effectiveness of large language models (LLMs) in conveying patient education on lumbar disc herniation, shedding light on the broader conversation surrounding the intersection of artificial intelligence (AI) and healthcare delivery (Lang et al., 2024). This inquiry is critical amid the rapid integration of generative AI technologies into patient education, underscored by the imperative for these technologies to produce outputs that are not only accurate and reliable but also comprehensible to the general public. The study focuses on evaluating LLMs – such as ChatGPT and Google Bard (now Gemini) – from the perspective of spine surgeons, analyzing the models' capacity for clarity, empathy, and accuracy in communication. While revealing insights into the potential role of LLMs in patient education, the research also exposes a complex web of methodological and conceptual challenges that invite a more detailed and nuanced conversation about AI's role in healthcare.

At the heart of this discourse lies the methodological approach adopted by Lang et al., which, while pioneering, underscores the inherent challenges in evaluating AI-driven tools within the medical field. The reliance on a homogeneous group of spine surgeons as the sole evaluators introduces a layer of subjectivity and potentially narrows the scope of the study's applicability. This methodological design speaks to a broader issue within the field of healthcare AI research: the critical importance of multidisciplinary evaluations in assessing AI tools. The complex nature of healthcare, with its myriad stakeholders including patients, healthcare professionals, ethicists, and regulators, necessitates a comprehensive evaluative framework that encompasses diverse perspectives. Such an approach would not only enhance the validity and generalizability of research findings but also ensure that AI applications are aligned with the multifaceted needs and expectations of the healthcare ecosystem.

Moreover, the findings reported by Lang et al., characterized by variability in evaluations and the presence of unsatisfactory responses, open a window into the ongoing debate over AI's reliability and accuracy in healthcare contexts. This variability is not merely a reflection of the subjective nature of human evaluations but also highlights the intrinsic limitations of current AI models. Despite significant advancements in AI technologies, the capacity of LLMs to understand and replicate the nuances of medical knowledge and patient communication remains imperfect. This observation is a microcosm of the broader challenges facing the integration of AI in healthcare: balancing the potential of AI to revolutionize patient education and care with the imperative for these technologies to be rigorously validated for clinical accuracy and safety.

The manuscript's discussion of regulatory considerations, notably the impending EU AI Act, serves as a segue into one of the most pressing issues in the field of healthcare AI: the development and implementation of regulatory frameworks that are capable of keeping pace with the rapid evolution of AI technologies. The regulation of AI in healthcare presents a complex puzzle, involving not only the technical aspects of AI performance and safety but also ethical considerations surrounding patient autonomy, privacy, and equity (Meskó and Topol, 2023). As AI models become increasingly integrated into healthcare delivery and patient education, the need for robust, dynamic regulatory frameworks that can adapt to technological advancements while safeguarding patient interests has never been more critical. This regulatory challenge is compounded by the global nature of AI development and deployment, calling for international collaboration in setting standards and guidelines that ensure the safe, ethical, and effective use of AI in healthcare.

Furthermore, the integration of AI into patient education, as explored by Lang et al., prompts a broader reflection on the future trajectory of AI in healthcare. The potential of AI to democratize access to medical knowledge, personalize patient education, and enhance the patient-care provider relationship is immense (Mittelstadt, 2021). Yet, realizing this potential hinges not only on overcoming technical and regulatory hurdles but also on addressing fundamental ethical questions. How can AI be leveraged to enhance patient autonomy without supplanting the human element of healthcare? In what ways might AI exacerbate or mitigate existing disparities in healthcare access and outcomes? These questions underscore the need for a forward-looking approach to AI integration that is rooted in ethical principles, prioritizes patient-centered outcomes, and embraces the complexities of healthcare delivery.

In conclusion, the manuscript by Lang et al. represents a significant contribution to the burgeoning field of healthcare AI, providing valuable insights into the use of LLMs in patient education. However, the study also serves as a catalyst for a broader, more critical discourse on the future of AI in healthcare. As the field progresses, it is imperative that research methodologies evolve to reflect the multidisciplinary nature of healthcare, that regulatory frameworks adapt to the dynamic landscape of AI technologies, and that ethical considerations remain at the forefront of AI integration efforts. Only through a holistic, collaborative approach can the promise of AI in healthcare be fully realized, enhancing patient education, care, and outcomes in an era of technological transformation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Bibliography

1. Lang S., Vitale J., Fekete T., Haschtmann D., Reitmeir R., Ropelato M. Are Large Language Models Valid Tools for Patient Information on Lumbar Disc Herniation? A Spine Surgeons' Perspective. Brain and Spine 2024 (in press). doi:10.1016/j.bas.2024.102804. PMCID: PMC11067000. PMID: 38706800.
2. Meskó B., Topol E.J. The imperative for regulatory oversight of large language models (or generative AI) in healthcare. NPJ Digital Medicine 6 (1), 2023, 120. doi:10.1038/s41746-023-00873-0. PMID: 37414860. PMCID: PMC10326069.
3. Mittelstadt B. The Impact of Artificial Intelligence on the Doctor-Patient Relationship. Report Commissioned by the Steering Committee for Human Rights in the Fields of Biomedicine and Health (CDBIO). 2021. Council of Europe. https://rm.coe.int/inf-2022-5-report-impact-of-ai-on-doctor-patient-relations-e/1680a68859