Comparison of Open-Source and Proprietary LLMs for Machine Reading Comprehension: A Practical Analysis for Industrial Applications
Mahaman Sanoussi Yahaya Alassan, Jessica López Espejel, Merieme Bouhandi, Walid Dahhane, El Hassane Ettifouri

TL;DR
This paper systematically compares open-source and proprietary large language models for machine reading comprehension to identify cost-effective, high-performance open-source alternatives suitable for industrial applications.
Contribution
It provides a practical analysis of LLMs in industrial MCR tasks, highlighting open-source models that match proprietary performance.
Findings
Open-source LLMs can achieve comparable accuracy to proprietary models.
Certain open-source models are more suitable for industrial deployment.
The study offers guidance for selecting LLMs in real-world applications.
Abstract
Large Language Models (LLMs) have recently demonstrated remarkable performance in various Natural Language Processing (NLP) applications, such as sentiment analysis, content generation, and personalized recommendations. Despite their impressive capabilities, there remains a significant need for systematic studies concerning the practical application of LLMs in industrial settings, as well as the specific requirements and challenges related to their deployment in these contexts. This need is particularly critical for Machine Reading Comprehension (MCR), where factual, concise, and accurate responses are required. To date, most MCR systems rely on Small Language Models (SLMs) or Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) networks. This trend is evident in the SQuAD2.0 rankings on the Papers with Code table. This article presents a comparative analysis between open-source LLMs…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
