A Survey of Spanish Clinical Language Models
Guillem García Subies, Álvaro Barbero Jiménez, Paloma Martínez Fernández

TL;DR
This survey reviews encoder language models for Spanish clinical text, benchmarks them on a curated subset of clinical corpora (more than 3000 models were fine-tuned in total), and makes the tested corpora and best models publicly available for future research.
Contribution
It offers a comprehensive overview of Spanish clinical language models, benchmarks them systematically, and releases datasets and models for reproducibility and further development.
Findings
Identified top-performing Spanish clinical language models
Benchmark results highlight the most effective models for clinical tasks
Resources are publicly available for future research
Abstract
This survey focuses on encoder Language Models for solving tasks in the clinical domain in the Spanish language. We review the contributions of 17 corpora focused mainly on clinical tasks, then list the most relevant Spanish Language Models and Spanish Clinical Language Models. We perform a thorough comparison of these models by benchmarking them over a curated subset of the available corpora, in order to find the best-performing ones; in total, more than 3000 models were fine-tuned for this study. All the tested corpora and the best models are made publicly available in an accessible way, so that the results can be reproduced by independent teams or challenged in the future when new Spanish Clinical Language Models are created.
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Interpreting and Communication in Healthcare
