Retrieve, Generate, Evaluate: A Case Study for Medical Paraphrases Generation with Small Language Models

Ioana Buhnila, Aman Sinha, and Mathieu Constant

arXiv:2407.16565·cs.CL·July 24, 2024



TL;DR

This paper introduces pRAGe, a pipeline utilizing Small Language Models and external knowledge bases to generate and evaluate medical paraphrases in French, addressing issues of hallucination and resource constraints.

Contribution

The work presents a novel pipeline combining retrieval and generation with small models for medical paraphrases, emphasizing effectiveness and resource efficiency.

Findings

01. Small Language Models can effectively generate medical paraphrases with external knowledge.

02. Retrieval-augmented generation improves factual accuracy in medical text.

03. The pipeline demonstrates promising results in French medical language processing.

Abstract

The recent surge in the accessibility of large language models (LLMs) to the general population can lead to untrackable use of such models for medical-related recommendations. Language generation via LLMs has two key problems: first, LLMs are prone to hallucination, so any medical use requires scientific and factual grounding; second, their very large size places heavy demands on computational resources. In this work, we introduce pRAGe, a pipeline for Retrieval Augmented Generation and evaluation of medical paraphrase generation using Small Language Models (SLMs). We study the effectiveness of SLMs and the impact of an external knowledge base for medical paraphrase generation in French.
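To make the retrieve-then-generate idea concrete, here is a minimal sketch of how retrieved passages might be folded into a paraphrase prompt. It is not the pRAGe implementation: the three-sentence knowledge base, the token-overlap ranking, the prompt wording, and the names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are all assumptions made for illustration, and the examples are in English rather than French. The call to a small language model is left out on purpose.

```python
import re
from collections import Counter

# Hypothetical toy knowledge base (an assumption for this sketch,
# not the knowledge base used in the paper).
KNOWLEDGE_BASE = [
    "Hypertension is persistently high blood pressure in the arteries.",
    "Tachycardia is a resting heart rate above 100 beats per minute.",
    "Dyspnea is the sensation of difficult or laboured breathing.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def retrieve(query, documents, k=1):
    """Return the k documents sharing the most tokens with the query.

    Token overlap stands in for a real retriever here; a dense or
    BM25 retriever could be swapped in without changing the caller.
    """
    query_counts = Counter(tokenize(query))

    def overlap(doc):
        doc_counts = Counter(tokenize(doc))
        return sum((query_counts & doc_counts).values())

    return sorted(documents, key=overlap, reverse=True)[:k]


def build_prompt(term, passages):
    """Build a grounded paraphrase prompt from the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Using only the context below, paraphrase the medical term "
        f"'{term}' in plain language.\n"
        f"Context:\n{context}\n"
        "Paraphrase:"
    )


passages = retrieve("tachycardia", KNOWLEDGE_BASE, k=1)
prompt = build_prompt("tachycardia", passages)
```

The resulting `prompt` string would then be passed to whichever small language model is under study. Restricting the model to the retrieved context is one common way to reduce hallucination, which is the concern the abstract raises.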


Code & Models

Repositories

ATILF-UMR7118/pRAGe (Official)

Videos

Retrieve, Generate, Evaluate: A Case Study for Medical Paraphrases Generation with Small Language Models · underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Balanced Selection