The Aloe Family Recipe for Open and Specialized Healthcare LLMs

Dario Garcia-Gasulla; Jordi Bayarri-Planas; Ashwin Kumar Gururajan; Enrique Lopez-Cuena; Adrian Tormos; Daniel Hinjos; Pablo Bernabeu-Perez; Anna Arias-Duart; Pablo Agustin Martin-Torres; Marta Gonzalez-Mallo; Sergio Alvarez-Napagao; Eduard Ayguadé-Parra; Ulises Cortés

arXiv:2505.04388·cs.CL·May 30, 2025

Open Access · 4 Models · 5 Datasets

TL;DR

This paper presents the Aloe Family of open-source healthcare LLMs, optimized through advanced data preprocessing, alignment, and evaluation methods, achieving competitive performance and safety standards for medical applications.

Contribution

It introduces a comprehensive recipe for developing open medical LLMs with improved safety, efficacy, and evaluation standards, and releases models under a permissive license.

Findings

1. Aloe models outperform many private counterparts on healthcare benchmarks.
2. Significant safety improvements, including resilience to jailbreaking attacks.
3. Models are preferred by healthcare professionals in evaluations.

Abstract

Purpose: With advancements in Large Language Models (LLMs) for healthcare, the need arises for competitive open-source models to protect the public interest. This work contributes to the field of open medical LLMs by optimizing key stages of data preprocessing and training, while showing how to improve model safety (through DPO) and efficacy (through RAG). The evaluation methodology used, which includes four different types of tests, defines a new standard for the field. The resultant models, shown to be competitive with the best private alternatives, are released with a permissive license.

Methods: Building on top of strong base models like Llama 3.1 and Qwen 2.5, Aloe Beta uses a custom dataset to enhance public data with synthetic Chain of Thought examples. The models undergo alignment with Direct Preference Optimization, emphasizing ethical and policy-aligned performance in the…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Statistical and Computational Modeling · Machine Learning in Healthcare

Methods: Balanced Selection · LLaMA