FZI-WIM at SemEval-2024 Task 2: Self-Consistent CoT for Complex NLI in Biomedical Domain

Jin Liu; Steffen Thoma

arXiv:2406.10040·cs.CL·June 17, 2024


TL;DR

This paper presents a self-consistent chain-of-thought (CoT) approach to biomedical natural language inference. The system samples multiple reasoning chains from the same prompt and takes a majority vote over their answers, which improves on single-chain CoT and ranks 1st on baseline F1 (0.80) at SemEval-2024 Task 2.

Contribution

It applies self-consistency to CoT prompting for biomedical NLI: instead of relying on one greedily decoded reasoning chain, the system samples several chains and selects the final answer by majority vote.

Findings

01 Achieved a baseline F1 score of 0.80, ranking 1st.

02 Attained a faithfulness score of 0.90, ranking 3rd.

03 Reached a consistency score of 0.73, ranking 12th.

Abstract

This paper describes the inference system of FZI-WIM at SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials. Our system uses the chain of thought (CoT) paradigm to tackle this complex reasoning problem and further improves CoT performance with self-consistency. Instead of greedy decoding, we sample multiple reasoning chains with the same prompt and make the final verification by majority voting. The self-consistent CoT system achieves a baseline F1 score of 0.80 (1st), a faithfulness score of 0.90 (3rd), and a consistency score of 0.73 (12th). We release the code and data publicly at https://github.com/jens5588/FZI-WIM-NLI4CT.
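
The sample-then-vote step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the label set (`entailment` / `contradiction`), the helper names `extract_label` and `self_consistent_label`, the `sample_chain` callable standing in for a sampled (non-greedy) LLM call, and the rule of taking the last label mentioned in a chain are all assumptions made for the example.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_label(chain: str) -> Optional[str]:
    """Return the last label mentioned in a reasoning chain, or None.

    Assumption: the final answer is the last occurrence of one of the
    two (assumed) labels in the generated text.
    """
    matches = re.findall(r"\b(entailment|contradiction)\b", chain.lower())
    return matches[-1] if matches else None


def self_consistent_label(
    sample_chain: Callable[[str], str],  # hypothetical: one sampled LLM completion
    prompt: str,
    n_samples: int = 10,
) -> Optional[str]:
    """Sample several chains for the same prompt and majority-vote the labels."""
    votes: Counter = Counter()
    for _ in range(n_samples):
        label = extract_label(sample_chain(prompt))
        if label is not None:  # chains without a recognizable label cast no vote
            votes[label] += 1
    if not votes:
        return None
    # On a tie, Counter.most_common keeps the label that was seen first.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Fixed, made-up chains standing in for sampled model outputs.
    fake_chains = iter([
        "The trial reports the outcome, so the answer is Entailment.",
        "The dosage differs from the statement. Contradiction.",
        "Both cohorts match the statement. Final answer: entailment",
    ])
    print(self_consistent_label(lambda p: next(fake_chains), "prompt", n_samples=3))
```

In a real setup, `sample_chain` would wrap a model call with sampling enabled (for example, a temperature above zero), so that repeated calls with the same prompt can yield different reasoning chains.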


Code & Models

Repositories

jens5588/fzi-wim-nli4ct
PyTorch · Official


Taxonomy

Topics: Biomedical Text Mining and Ontologies · Natural Language Processing Techniques · Topic Modeling