Disparities in Multilingual LLM-Based Healthcare Q&A
Ipek Baris Schlicht, Burcu Sayin, Zhixue Zhao, Frederik M. Labonté, Cesare Barbera, Marco Viviani, Paolo Rosso, Lucie Flek

TL;DR
This paper investigates disparities in multilingual healthcare Q&A using LLMs, revealing significant cross-lingual differences in information coverage and factual accuracy, and explores methods to improve equitable knowledge alignment.
Contribution
It introduces a multilingual healthcare dataset, analyzes cross-lingual disparities in LLM responses, and demonstrates how contextual information can enhance factual alignment across languages.
Findings
LLM responses align more closely with English Wikipedia than with the other language editions.
Providing contextual excerpts improves factual alignment with culturally relevant knowledge.
Significant disparities exist in healthcare information coverage across languages.
Abstract
Equitable access to reliable health information is vital when integrating AI into healthcare. Yet, information quality varies across languages, raising concerns about the reliability and consistency of multilingual Large Language Models (LLMs). We systematically examine cross-lingual disparities in pre-training source and factuality alignment in LLM answers for multilingual healthcare Q&A across English, German, Turkish, Chinese (Mandarin), and Italian. We (i) constructed Multilingual Wiki Health Care (MultiWikiHealthCare), a multilingual dataset from Wikipedia; (ii) analyzed cross-lingual healthcare coverage; (iii) assessed LLM response alignment with these references; and (iv) conducted a case study on factual alignment through the use of contextual information and Retrieval-Augmented Generation (RAG). Our findings reveal substantial cross-lingual disparities in both Wikipedia…
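To make steps (iii) and (iv) concrete, here is a minimal Python sketch of how one might check which language edition an answer aligns with, and how a contextual excerpt could be placed in a prompt. It is an illustration under stated assumptions, not the authors' pipeline: the function names `build_prompt`, `token_overlap`, and `best_aligned_language` are hypothetical, and Jaccard token overlap is a crude stand-in for whatever factuality-alignment measure the paper uses. Whitespace tokenization also does not segment Chinese text, so a real comparison would need a language-aware tokenizer.

```python
from typing import Optional


def build_prompt(question: str, context: Optional[str] = None) -> str:
    """Return a Q&A prompt, optionally preceded by a reference excerpt.

    Hypothetical prompt template; the paper's actual wording is not known here.
    """
    if context:
        return (
            "Use the following excerpt when answering.\n"
            f"Excerpt: {context}\n"
            f"Question: {question}\n"
            "Answer:"
        )
    return f"Question: {question}\nAnswer:"


def token_overlap(answer: str, reference: str) -> float:
    """Jaccard overlap of lowercased whitespace tokens (stand-in metric)."""
    a = set(answer.lower().split())
    b = set(reference.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_aligned_language(answer: str, references: dict[str, str]) -> str:
    """Return the language code whose reference overlaps most with the answer."""
    return max(references, key=lambda lang: token_overlap(answer, references[lang]))
```

For example, calling `best_aligned_language` with one reference excerpt per language returns the code of the edition that shares the most tokens with the answer. Running the same comparison with and without an excerpt passed to `build_prompt` is one simple way to see whether added context shifts the alignment.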
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Health Literacy and Information Accessibility · Topic Modeling
