Mini-Giants: "Small" Language Models and Open Source Win-Win
Zhengping Zhou, Lezhi Li, Xinxi Chen, Andy Li

TL;DR
This paper advocates for the development and adoption of small, open-source language models, highlighting their technical, ethical, and social advantages over large models like ChatGPT.
Contribution
It provides a comprehensive overview of mini-giant models, compares various small language models, and discusses their practical applications and evaluation methods.
Findings
Open-source mini-giants are cost-effective alternatives to large models.
Small models can achieve competitive performance with proper techniques.
Open-source community engagement accelerates development and ethical deployment.
Abstract
ChatGPT is phenomenal. However, training and refining such giant models is prohibitively expensive. Fortunately, small language models are flourishing and becoming increasingly competent. We call them "mini-giants". We argue that open-source communities like Kaggle and mini-giants create a win-win in many ways: technically, ethically, and socially. In this article, we give a brief yet rich background, discuss how to attain small language models, present a comparative study of small language models along with a brief discussion of evaluation methods, discuss the real-world application scenarios where small language models are most needed, and conclude with discussion and outlook.
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling
