Performance Evaluation of Large Language Models in Bangla Consumer Health Query Summarization

Ajwad Abrar; Farzana Tabassum; Sabbir Ahmed

arXiv:2505.05070 · cs.CL · September 16, 2025

TL;DR

This study evaluates the zero-shot performance of nine large language models in summarizing Bengali consumer health queries, demonstrating that LLMs can produce high-quality summaries comparable to fine-tuned models in low-resource language settings.

Contribution

The paper benchmarks nine zero-shot LLMs on Bengali consumer health query summarization against the fine-tuned Bangla T5 model and shows that zero-shot models can rival it, highlighting their potential in low-resource languages.

Findings

01

Mixtral-8x22b-Instruct achieved the top ROUGE-1 and ROUGE-L scores.

02

Bangla T5 performed best in ROUGE-2.

03

Zero-shot LLMs can produce high-quality summaries without task-specific training.

Abstract

Consumer Health Queries (CHQs) in Bengali (Bangla), a low-resource language, often contain extraneous details, complicating efficient medical responses. This study investigates the zero-shot performance of nine advanced large language models (LLMs): GPT-3.5-Turbo, GPT-4, Claude-3.5-Sonnet, Llama3-70b-Instruct, Mixtral-8x22b-Instruct, Gemini-1.5-Pro, Qwen2-72b-Instruct, Gemma-2-27b, and Athene-70B, in summarizing Bangla CHQs. Using the BanglaCHQ-Summ dataset comprising 2,350 annotated query-summary pairs, we benchmarked these LLMs using ROUGE metrics against Bangla T5, a fine-tuned state-of-the-art model. Mixtral-8x22b-Instruct emerged as the top performing model in ROUGE-1 and ROUGE-L, while Bangla T5 excelled in ROUGE-2. The results demonstrate that zero-shot LLMs can rival fine-tuned models, achieving high-quality summaries even without task-specific training. This work underscores…
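The ROUGE metrics named in the abstract can be illustrated with a minimal sketch. It computes F1 variants of ROUGE-1, ROUGE-2, and ROUGE-L over whitespace-separated tokens. The whitespace tokenization, the choice of F1 as the reported score, and the function names `rouge_n` and `rouge_l` are assumptions made for illustration; the paper's actual scoring tool and tokenizer are not stated in this summary.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, ref_total, cand_total):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    if overlap == 0 or ref_total == 0 or cand_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(reference, candidate, n=1):
    """ROUGE-N F1 from clipped n-gram overlap (whitespace tokens assumed)."""
    ref = ngrams(reference.split(), n)
    cand = ngrams(candidate.split(), n)
    overlap = sum((ref & cand).values())  # clipped counts
    return f1(overlap, sum(ref.values()), sum(cand.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate):
    """ROUGE-L F1 based on the longest common subsequence."""
    ref, cand = reference.split(), candidate.split()
    return f1(lcs_length(ref, cand), len(ref), len(cand))
```

Because the functions only split on whitespace, they apply to Bangla text as written, although a real evaluation would likely need a language-aware tokenizer or normalizer. Averaging each score over all reference-summary pairs gives a corpus-level figure of the kind used to compare models.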

Taxonomy

Methods: Gated Linear Unit · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Attention Dropout · Softmax · Absolute Position Encodings · Residual Connection · Linear Layer