Can Open-Source LLMs Compete with Commercial Models? Exploring the Few-Shot Performance of Current GPT Models in Biomedical Tasks
Samy Ateia, Udo Kruschwitz

TL;DR
This study evaluates the few-shot and zero-shot performance of open-source and commercial GPT models in biomedical retrieval tasks, finding that few-shot learning narrows the gap, especially in domain-specific applications.
Contribution
It provides a comparative analysis of current GPT models and open-source alternatives in biomedical NLP, highlighting the effectiveness of few-shot learning in closing performance gaps.
Findings
Mixtral 8x7b is competitive in 10-shot settings.
Zero-shot performance of open-source models is significantly lower.
Few-shot examples improve domain-specific task performance.
Abstract
Commercial large language models (LLMs), like OpenAI's GPT-4 powering ChatGPT and Anthropic's Claude 3 Opus, have dominated natural language processing (NLP) benchmarks across different domains. New competing Open-Source alternatives like Mixtral 8x7B or Llama 3 have emerged and seem to be closing the gap while often offering higher throughput and being less costly to use. Open-Source LLMs can also be self-hosted, which makes them interesting for enterprise and clinical use cases where sensitive data should not be processed by third parties. We participated in the 12th BioASQ challenge, which is a retrieval augmented generation (RAG) setting, and explored the performance of current GPT models Claude 3 Opus, GPT-3.5-turbo and Mixtral 8x7b with in-context learning (zero-shot, few-shot) and QLoRa fine-tuning. We also explored how additional relevant knowledge from Wikipedia added to the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsScientific Computing and Data Management
MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · 15 Ways to Contact How can i speak to someone at Delta Airlines · Attention Is All You Need · Linear Warmup With Linear Decay · Cosine Annealing · Label Smoothing · Linear Layer · BART · Weight Decay · Softmax
