Performance Evaluation of Large Language Models in Bangla Consumer Health Query Summarization
Ajwad Abrar, Farzana Tabassum, Sabbir Ahmed

TL;DR
This study evaluates the zero-shot performance of nine large language models in summarizing Bengali consumer health queries, demonstrating that LLMs can produce high-quality summaries comparable to fine-tuned models in low-resource language settings.
Contribution
It benchmarks multiple LLMs on Bengali health query summarization and shows zero-shot models can rival fine-tuned models, highlighting their potential in low-resource languages.
Findings
Mixtral-8x22b-Instruct achieved the top ROUGE-1 and ROUGE-L scores.
Bangla T5 performed best in ROUGE-2.
Zero-shot LLMs can produce high-quality summaries without task-specific training.
Abstract
Consumer Health Queries (CHQs) in Bengali (Bangla), a low-resource language, often contain extraneous details, complicating efficient medical responses. This study investigates the zero-shot performance of nine advanced large language models (LLMs): GPT-3.5-Turbo, GPT-4, Claude-3.5-Sonnet, Llama3-70b-Instruct, Mixtral-8x22b-Instruct, Gemini-1.5-Pro, Qwen2-72b-Instruct, Gemma-2-27b, and Athene-70B, in summarizing Bangla CHQs. Using the BanglaCHQ-Summ dataset comprising 2,350 annotated query-summary pairs, we benchmarked these LLMs using ROUGE metrics against Bangla T5, a fine-tuned state-of-the-art model. Mixtral-8x22b-Instruct emerged as the top-performing model in ROUGE-1 and ROUGE-L, while Bangla T5 excelled in ROUGE-2. The results demonstrate that zero-shot LLMs can rival fine-tuned models, achieving high-quality summaries even without task-specific training. This work underscores…
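The ROUGE-1, ROUGE-2, and ROUGE-L metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the abstract does not say which ROUGE implementation or tokenizer was used, so whitespace tokenization, the F1 variant of each score, and the function names `rouge_n` and `rouge_l` are all assumptions made here for illustration.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/cand) and recall (overlap/ref)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 using clipped n-gram overlap.

    Assumption: whitespace tokenization, which also splits Bangla text
    into words but ignores punctuation handling and stemming.
    """
    cand = _ngrams(candidate.split(), n)
    ref = _ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # min count per shared n-gram
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def _lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Sentence-level ROUGE-L F1 based on the longest common subsequence."""
    cand = candidate.split()
    ref = reference.split()
    return _f1(_lcs_length(cand, ref), len(cand), len(ref))
```

For a model-generated summary and its reference summary, `rouge_n(summary, reference, 1)`, `rouge_n(summary, reference, 2)`, and `rouge_l(summary, reference)` give the three per-pair scores; a benchmark would typically average them over the dataset. ROUGE-2 rewards exact two-word matches, so it is stricter about phrasing than ROUGE-1 or ROUGE-L.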
Taxonomy
Methods: Gated Linear Unit · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Attention Dropout · Softmax · Absolute Position Encodings · Residual Connection · Linear Layer
