A Nationwide Japanese Medical Claims Foundation Model: Balancing Model Scaling and Task-Specific Computational Efficiency

Nanae Aratake; Taisei Tosaki; Yuji Okamoto; Eiichiro Uchino; Masaki Nakamura; Nobutomo Matsui; Akiko Hatakama; Yasushi Okuno

arXiv:2604.22348 · cs.LG · April 27, 2026

TL;DR

This study investigates how model size affects downstream performance on structured medical data. It finds that the optimal size varies by task, so choosing a model means balancing accuracy against computational cost.

Contribution

The paper provides empirical evidence on how model scale relates to downstream task performance in medical foundation models, showing that the point at which performance saturates depends on the task.

Findings

01

Disease prediction improves with larger models (32M-101M parameters).

02

Medication prediction saturates at 11M parameters, so a smaller model suffices and training time is reduced.

03

The best models outperform the baseline on precision-recall metrics.
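
One way to make the idea of a task-dependent saturation point concrete is to pick the smallest model whose score falls within a small tolerance of the best score for that task. The sketch below is an illustration only, not the authors' procedure: the function name `saturation_size`, the `tolerance` parameter, and every score in the example are hypothetical and do not come from the paper.

```python
def saturation_size(scores, tolerance=0.01):
    """Return the smallest model size whose score is within
    `tolerance` of the best score across all sizes.

    `scores` maps parameter count -> metric value (higher is better).
    """
    best = max(scores.values())
    # Walk sizes from smallest to largest and stop at the first one
    # that is close enough to the best score.
    for size in sorted(scores):
        if scores[size] >= best - tolerance:
            return size


# Hypothetical scores for illustration only; NOT results from the paper.
hypothetical_scores = {
    3_000_000: 0.40,
    11_000_000: 0.50,
    32_000_000: 0.505,
    101_000_000: 0.506,
}

print(saturation_size(hypothetical_scores))
```

With a rule like this, a task whose scores plateau early selects a small model, while a task whose scores keep rising selects the largest one. The tolerance encodes how much accuracy one is willing to trade for lower compute.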

Abstract

Clinical risk prediction using longitudinal medical data supports individualized care. Self-supervised foundation models have emerged as a promising approach for leveraging large-scale unlabeled healthcare records. In natural language processing, scaling laws suggest that larger models achieve predictably lower pretraining losses, supporting the foundation model paradigm. However, for structured medical data, characterized by a limited vocabulary and sparse observations, whether increasing model size consistently improves downstream predictions is unclear, as most studies evaluate only a single model scale. In this study, we evaluated the relationship between model scale and downstream task performance for structured medical foundation models. Using a random sample (2.3 million patients, 32 hospitals) from a nationwide 519-hospital Japanese claims database, we pretrained encoder-only…
