SylloBio-NLI: Evaluating Large Language Models on Biomedical Syllogistic Reasoning

Magdalena Wysocka; Danilo Carvalho; Oskar Wysocki; Marco Valentino; Andre Freitas

arXiv:2410.14399·cs.CL·February 11, 2025

TL;DR

This paper introduces SylloBio-NLI, a framework for evaluating large language models on biomedical syllogistic reasoning, revealing significant challenges and the impact of prompting techniques on model performance.

Contribution

The paper presents a novel framework for biomedical syllogistic reasoning evaluation and provides extensive analysis of LLMs' capabilities and limitations in this domain.

Findings

01

Zero-shot LLMs achieve average accuracy between 70% (generalized modus ponens) and 23% (disjunctive syllogism), depending on the syllogistic scheme.

02

Few-shot prompting improves performance significantly.

03

Models are highly sensitive to superficial lexical variations.

Abstract

Syllogistic reasoning is crucial for Natural Language Inference (NLI). This capability is particularly significant in specialized domains such as biomedicine, where it can support automatic evidence interpretation and scientific discovery. This paper presents SylloBio-NLI, a novel framework that leverages external ontologies to systematically instantiate diverse syllogistic arguments for biomedical NLI. We employ SylloBio-NLI to evaluate Large Language Models (LLMs) on identifying valid conclusions and extracting supporting evidence across 28 syllogistic schemes instantiated with human genome pathways. Extensive experiments reveal that biomedical syllogistic reasoning is particularly challenging for zero-shot LLMs, which achieve an average accuracy between 70% on generalized modus ponens and 23% on disjunctive syllogism. At the same time, we found that few-shot prompting can boost the…
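
To make the two schemes named in the abstract concrete, the following is a minimal Python sketch of how a syllogistic argument might be instantiated from an ontology and turned into an NLI prompt. It is not the authors' implementation: the toy ontology (`PARENT`, `MEMBER`), the gene and pathway names, the sentence templates, and the helper functions are all hypothetical placeholders chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Toy, made-up ontology (assumption, not data from the paper):
# child pathway -> parent pathway, and gene -> pathway it belongs to.
PARENT = {"Pathway A1": "Pathway A"}
MEMBER = {"GENE1": "Pathway A1", "GENE2": "Pathway B"}


@dataclass
class Syllogism:
    scheme: str
    premises: list[str]
    conclusion: str


def generalized_modus_ponens(gene: str) -> Syllogism:
    """For all x: P(x) -> Q(x); P(a); therefore Q(a)."""
    child = MEMBER[gene]
    parent = PARENT[child]
    return Syllogism(
        scheme="generalized modus ponens",
        premises=[
            f"Every gene that is a member of {child} is a member of {parent}.",
            f"{gene} is a member of {child}.",
        ],
        conclusion=f"{gene} is a member of {parent}.",
    )


def disjunctive_syllogism(gene: str, other: str) -> Syllogism:
    """P or Q; not P; therefore Q."""
    actual = MEMBER[gene]
    return Syllogism(
        scheme="disjunctive syllogism",
        premises=[
            f"{gene} is a member of {other} or {actual}.",
            f"{gene} is not a member of {other}.",
        ],
        conclusion=f"{gene} is a member of {actual}.",
    )


def to_prompt(s: Syllogism) -> str:
    """Render the argument as a hypothetical True/False NLI question."""
    lines = [f"Premise {i}: {p}" for i, p in enumerate(s.premises, 1)]
    lines.append(f"Conclusion: {s.conclusion}")
    lines.append("Does the conclusion follow from the premises? Answer True or False.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_prompt(generalized_modus_ponens("GENE1")))
    print()
    print(to_prompt(disjunctive_syllogism("GENE2", "Pathway A")))
```

Swapping the entities while keeping the template fixed is one simple way to probe the sensitivity to surface wording that the findings mention; the paper's actual schemes and templates may differ from this sketch.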

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Machine Learning in Healthcare