A Nationwide Japanese Medical Claims Foundation Model: Balancing Model Scaling and Task-Specific Computational Efficiency
Nanae Aratake, Taisei Tosaki, Yuji Okamoto, Eiichiro Uchino, Masaki Nakamura, Nobutomo Matsui, Akiko Hatakama, Yasushi Okuno

TL;DR
This study investigates how model size affects performance in structured medical data tasks, finding that optimal size varies by task and balancing accuracy with computational efficiency.
Contribution
It provides empirical evidence on the relationship between model scale and downstream task performance in medical foundation models, highlighting task-dependent saturation points.
Findings
Disease prediction improves with larger models (32M-101M parameters).
Medication prediction saturates at 11M parameters, reducing training time.
Best models outperform baseline in precision-recall metrics.
Abstract
Clinical risk prediction using longitudinal medical data supports individualized care. Self-supervised foundation models have emerged as a promising approach for leveraging large-scale unlabeled healthcare records. In natural language processing, scaling laws suggest that larger models achieve predictably lower pretraining losses, supporting the foundation model paradigm. However, for structured medical data, characterized by a limited vocabulary and sparse observations, whether increasing model size consistently improves downstream predictions is unclear, as most studies evaluate only a single model scale. In this study, we evaluated the relationship between model scale and downstream task performance for structured medical foundation models. Using a random sample (2.3 million patients, 32 hospitals) from a nationwide 519-hospital Japanese claims database, we pretrained encoder-only…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
