Rethinking Test-Time Scaling for Medical AI: Model and Task-Aware Strategies for LLMs and VLMs
Gyutaek Oh, Seoyeon Kim, Sangjoon Park, and Byung-Hoon Kim

TL;DR
This paper investigates test-time scaling strategies for large language models (LLMs) and vision-language models (VLMs) in medical AI, analyzing their effectiveness and robustness and offering practical guidelines for improving model reliability and interpretability.
Contribution
The paper offers a comprehensive evaluation of test-time scaling in medical AI, covering model-specific strategies and a robustness analysis under user-driven factors such as misleading information embedded in prompts.
Findings
Test-time scaling improves reasoning in medical models.
Effectiveness varies with model type and task complexity.
Strategies can be refined for better robustness and interpretability.
Abstract
Test-time scaling has recently emerged as a promising approach for enhancing the reasoning capabilities of large language models or vision-language models during inference. Although a variety of test-time scaling strategies have been proposed, and interest in their application to the medical domain is growing, many critical aspects remain underexplored, including their effectiveness for vision-language models and the identification of optimal strategies for different settings. In this paper, we conduct a comprehensive investigation of test-time scaling in the medical domain. We evaluate its impact on both large language models and vision-language models, considering factors such as model size, inherent model characteristics, and task complexity. Finally, we assess the robustness of these strategies under user-driven factors, such as misleading information embedded in prompts. Our…
Taxonomy
Topics: Scientific Computing and Data Management · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare
