Predicting LLM Correctness in Prosthodontics Using Metadata and Hallucination Signals
Lucky Susanto, Anasta Pranawijayana, Cortino Sukotjo, Soni Prasad, Derry Wijaya

TL;DR
This paper explores predicting the correctness of LLM responses in prosthodontics using metadata and hallucination signals, demonstrating modest accuracy improvements and revealing insights into model behavior and reliability signals.
Contribution
It introduces a novel approach combining metadata and hallucination signals to predict LLM correctness in high-stakes medical domains, highlighting the impact of prompting strategies.
Findings
Metadata-based approach improves accuracy by up to 7.14%
Achieves a precision of 83.12%, measured against a baseline
Hallucination signals are strong indicators of incorrectness
Abstract
Large language models (LLMs) are increasingly adopted in high-stakes domains such as healthcare and medical education, where the risk of generating factually incorrect (i.e., hallucinated) information is a major concern. While significant efforts have been made to detect and mitigate such hallucinations, predicting whether an LLM's response is correct remains a critical yet underexplored problem. This study investigates the feasibility of predicting correctness by analyzing a general-purpose model (GPT-4o) and a reasoning-centric model (OSS-120B) on a multiple-choice prosthodontics exam. We use metadata and hallucination signals across three distinct prompting strategies to build a correctness predictor for each (model, prompting strategy) pair. Our findings demonstrate that this metadata-based approach can improve accuracy by up to +7.14% and achieve a precision of 83.12% over a baseline…
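The evaluation idea in the abstract, one correctness predictor per (model, prompting strategy) pair scored against a baseline, can be sketched as follows. This is a minimal illustration and not the authors' method: the record fields (`hallucination_flags`, `is_correct`), the threshold rule, the toy records, and the pair names are all hypothetical. The abstract is truncated before it describes the baseline, so the sketch assumes a trivial baseline that labels every response as correct.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of responses whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision(y_true, y_pred):
    """Fraction of responses predicted 'correct' that really are correct."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p and t)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p and not t)
    return tp / (tp + fp) if (tp + fp) else 0.0


def group_by_pair(records):
    """Split records so each (model, prompting) pair is evaluated separately."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["model"], r["prompting"])].append(r)
    return dict(groups)


def threshold_predictor(record, max_flags=0):
    """Hypothetical rule: predict 'correct' when few hallucination flags fired."""
    return record["hallucination_flags"] <= max_flags


def evaluate(records, predictor):
    """Compare a predictor with the assumed always-correct baseline."""
    y_true = [r["is_correct"] for r in records]
    y_pred = [predictor(r) for r in records]
    baseline = [True] * len(records)  # assumption: baseline trusts every answer
    return {
        "baseline_accuracy": accuracy(y_true, baseline),
        "baseline_precision": precision(y_true, baseline),
        "predictor_accuracy": accuracy(y_true, y_pred),
        "predictor_precision": precision(y_true, y_pred),
    }


# Made-up toy records for one hypothetical pair; not data from the paper.
TOY_RECORDS = [
    {"model": "model_a", "prompting": "prompt_1", "hallucination_flags": 0, "is_correct": True},
    {"model": "model_a", "prompting": "prompt_1", "hallucination_flags": 0, "is_correct": True},
    {"model": "model_a", "prompting": "prompt_1", "hallucination_flags": 2, "is_correct": False},
    {"model": "model_a", "prompting": "prompt_1", "hallucination_flags": 1, "is_correct": True},
    {"model": "model_a", "prompting": "prompt_1", "hallucination_flags": 0, "is_correct": False},
]

if __name__ == "__main__":
    for pair, recs in group_by_pair(TOY_RECORDS).items():
        print(pair, evaluate(recs, threshold_predictor))
```

Under the assumed baseline, baseline precision equals the model's raw accuracy on that pair, so any precision gain reflects how well the signals filter out incorrect answers. A learned classifier over richer metadata could replace `threshold_predictor` without changing the evaluation loop.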
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Topic Modeling
