Counting Clues: A Lightweight Probabilistic Baseline Can Match an LLM

Furong Jia; Yuan Pu; Finn Guo; Monica Agrawal

arXiv:2512.12868 · cs.CL · December 16, 2025

Open Access

TL;DR

This paper introduces FBPR, a simple probabilistic method that performs comparably to LLMs on MedQA diagnosis questions. The two approaches tend to answer different questions correctly, which highlights the value of explicit probabilistic baselines.

Contribution

The paper presents FBPR, a lightweight ranker based on smoothed Naive Bayes. Using concept-diagnosis co-occurrence statistics drawn from LLM pretraining corpora, it achieves results comparable to the LLMs pretrained on those corpora.

Findings

01 FBPR achieves performance comparable to the corresponding LLMs on MedQA diagnosis questions.

02 FBPR and LLMs largely answer different questions correctly, with overlap only slightly above random chance, showing complementary strengths.

03 Explicit probabilistic methods remain a strong baseline on this benchmark.
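The "overlap only slightly above random chance" finding can be made concrete with a small calculation. The sketch below assumes that "random chance" means the two methods' correct answers are statistically independent, so the expected overlap is the product of their accuracies; the paper may define it differently. The function name and the toy correctness vectors are hypothetical and are not results from the paper.

```python
def overlap_vs_chance(correct_a, correct_b):
    """Compare observed joint-correct rate with the rate expected under independence.

    correct_a, correct_b: equal-length lists of booleans, one entry per question,
    marking whether each method answered that question correctly.
    """
    n = len(correct_a)
    acc_a = sum(correct_a) / n
    acc_b = sum(correct_b) / n
    # Fraction of questions both methods answered correctly.
    observed = sum(a and b for a, b in zip(correct_a, correct_b)) / n
    # Assumption: under independence, joint-correct rate = acc_a * acc_b.
    expected = acc_a * acc_b
    return observed, expected


# Toy data (made up for illustration only).
method_a = [True, True, False, False]
method_b = [True, False, True, False]
observed, expected = overlap_vs_chance(method_a, method_b)
```

An observed value close to the expected value suggests the two methods' successes are nearly unrelated, which is the situation in which combining them is most likely to help.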

Abstract

Large language models (LLMs) excel on multiple-choice clinical diagnosis benchmarks, yet it is unclear how much of this performance reflects underlying probabilistic reasoning. We study this through questions from MedQA, where the task is to select the most likely diagnosis. We introduce the Frequency-Based Probabilistic Ranker (FBPR), a lightweight method that scores options with a smoothed Naive Bayes over concept-diagnosis co-occurrence statistics from a large corpus. When co-occurrence statistics were sourced from the pretraining corpora for OLMo and Llama, FBPR achieves comparable performance to the corresponding LLMs pretrained on that same corpus. Direct LLM inference and FBPR largely get different questions correct, with an overlap only slightly above random chance, indicating complementary strengths of each method. These findings highlight the continued value of explicit…
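To illustrate what scoring options with a smoothed Naive Bayes over co-occurrence counts can look like, here is a minimal sketch. It is not the authors' implementation: the function name `rank_options`, the add-alpha smoothing scheme, the count-based prior, and the toy counts are all assumptions made for illustration.

```python
import math


def rank_options(concepts, options, cooc, alpha=1.0):
    """Rank candidate diagnoses with add-alpha smoothed Naive Bayes.

    concepts: clinical concepts extracted from the question (assumed given).
    options:  candidate diagnoses (the multiple-choice answers).
    cooc:     dict mapping (concept, diagnosis) -> co-occurrence count.
    """
    vocab = {c for (c, _) in cooc} | set(concepts)
    # Total co-occurrence mass per diagnosis, used for prior and likelihoods.
    totals = {d: sum(n for (_, dd), n in cooc.items() if dd == d) for d in options}
    grand_total = sum(totals.values())

    scores = {}
    for d in options:
        # Smoothed log prior over the candidate diagnoses (an assumption).
        score = math.log((totals[d] + alpha) / (grand_total + alpha * len(options)))
        for c in concepts:
            # Smoothed log likelihood of each concept given the diagnosis.
            count = cooc.get((c, d), 0)
            score += math.log((count + alpha) / (totals[d] + alpha * len(vocab)))
        scores[d] = score

    ranking = sorted(options, key=scores.get, reverse=True)
    return ranking, scores


# Toy counts (made up; not statistics from any real corpus).
toy_cooc = {
    ("fever", "influenza"): 80,
    ("cough", "influenza"): 60,
    ("fever", "appendicitis"): 30,
    ("abdominal pain", "appendicitis"): 90,
    ("cough", "appendicitis"): 2,
}
ranking, scores = rank_options(["fever", "cough"], ["influenza", "appendicitis"], toy_cooc)
```

Working in log space avoids underflow when many concepts are multiplied together, and the smoothing term keeps an unseen concept-diagnosis pair from driving a score to negative infinity.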

Taxonomy

Topics: Machine Learning in Healthcare · Topic Modeling · Artificial Intelligence in Healthcare and Education