Loading paper
Performance of Large Language Models on Cognitive Aptitude Testing: A Multi-Run Evaluation on the German Medical School Admission Test (TMS) | Tomesphere