SoftTiger: A Clinical Foundation Model for Healthcare Workflows
Ye Chen, Igor Couto, Wei Cai, Cong Fu, Bruno Dorneles

TL;DR
SoftTiger is a specialized large language model designed for healthcare workflows, capable of structuring clinical notes and supporting various clinical tasks, aiming to advance healthcare digitalization.
Contribution
The paper introduces SoftTiger, a novel clinical LLM trained on annotated clinical data, supporting multiple clinical tasks and addressing healthcare-specific modeling challenges.
Findings
SoftTiger outperforms popular open-source models and GPT-3.5 in evaluations.
Supports basic to complex clinical tasks, including abbreviation expansion and temporal information extraction.
Models and datasets are publicly released to aid healthcare AI development.
Abstract
We introduce SoftTiger, a clinical large language model (CLaM) designed as a foundation model for healthcare workflows. The narrative and unstructured nature of clinical notes is a major obstacle for healthcare intelligentization. We address a critical problem of structuring clinical notes into clinical data, according to international interoperability standards. We collect and annotate data for three subtasks, namely, international patient summary, clinical impression and medical encounter. We then perform supervised fine-tuning of a state-of-the-art LLM using public and credentialed clinical data. The training is orchestrated in a way that the target model can first support basic clinical tasks such as abbreviation expansion and temporal information extraction, and then learn to perform more complex downstream clinical tasks. Moreover, we address several modeling challenges in the healthcare…
Taxonomy
Topics: Business Process Modeling and Analysis · Scientific Computing and Data Management · Electronic Health Records Systems
Methods: Attention Is All You Need · Cosine Annealing · Label Smoothing · Absolute Position Encodings · Linear Layer · Position-Wise Feed-Forward Layer · Transformer · Layer Normalization
