UM_FHS at TREC 2024 PLABA: Exploration of Fine-tuning and AI agent approach for plain language adaptations of biomedical text

Primoz Kocbek; Leon Kopitar; Zhihong Zhang; Emirhan Aydin; Maxim Topaz; Gregor Stiglic

arXiv:2502.14144·cs.CL·February 21, 2025

Open Access

TL;DR

This paper explores methods for simplifying biomedical abstracts for a K8-level audience (13- to 14-year-old students), comparing prompt engineering, a two-agent system, and fine-tuning with OpenAI's gpt-4o and gpt-4o-mini models, and highlights the strengths and weaknesses of each approach.

Contribution

It presents a comparative analysis of three AI-based methods for biomedical text simplification, emphasizing the effectiveness of prompt engineering with gpt-4o-mini.

Findings

01

Prompt engineering with gpt-4o-mini yields the best qualitative simplicity scores.

02

Fine-tuning improves accuracy and completeness but reduces simplicity.

03

Two-agent approach performs well but is less effective than prompt engineering.

Abstract

This paper describes our submissions to the TREC 2024 PLABA track, which aimed to simplify biomedical abstracts for a K8-level audience (13- to 14-year-old students). We tested three approaches using OpenAI's gpt-4o and gpt-4o-mini models: baseline prompt engineering, a two-AI agent approach, and fine-tuning. Adaptations were evaluated using qualitative metrics (5-point Likert scales for simplicity, accuracy, completeness, and brevity) and quantitative readability scores (Flesch-Kincaid grade level, SMOG Index). Results indicated that the two-agent approach and baseline prompt engineering with gpt-4o-mini models showed superior qualitative performance, while fine-tuned models excelled in accuracy and completeness but were less simple. The evaluation results demonstrated that prompt engineering with gpt-4o-mini outperforms iterative improvement strategies via the two-agent approach as well as…
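The two quantitative readability scores named in the abstract have standard published formulas, and a minimal sketch of how they could be computed is given here for orientation. This is not the authors' evaluation code: the function names, the regex tokenizer, and the vowel-group syllable counter are all assumptions made for illustration, and the syllable heuristic is only approximate, so scores will differ somewhat from those of dedicated readability tools.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as vowel groups (heuristic, an assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in '-le' and '-ee' endings.
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def split_text(text: str) -> tuple[list[str], list[str]]:
    """Naive sentence and word segmentation based on punctuation."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    return sentences, words


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level: 0.39*(W/S) + 11.8*(Syl/W) - 15.59."""
    sentences, words = split_text(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def smog_index(text: str) -> float:
    """SMOG index: 1.0430*sqrt(polysyllables * 30 / S) + 3.1291."""
    sentences, words = split_text(text)
    if not sentences:
        raise ValueError("text must contain at least one sentence")
    # Polysyllables are words with three or more syllables.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


if __name__ == "__main__":
    original = ("Myocardial infarction frequently necessitates immediate "
                "pharmacological intervention.")
    adapted = "A heart attack often needs fast treatment with medicine."
    for label, text in (("original", original), ("adapted", adapted)):
        print(label, round(flesch_kincaid_grade(text), 2),
              round(smog_index(text), 2))
```

Both scores estimate a US school grade, so an adaptation aimed at a K8-level audience would be expected to score lower than its source abstract. The SMOG formula was designed for samples of about 30 sentences, so values on very short texts should be read with caution.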

Taxonomy

Topics: Topic Modeling