Mini-Giants: "Small" Language Models and Open Source Win-Win

Zhengping Zhou; Lezhi Li; Xinxi Chen; Andy Li

arXiv:2307.08189 · cs.CL · July 9, 2024 · 2 cites



Open Access

TL;DR

This paper advocates for the development and adoption of small, open-source language models, highlighting their technical, ethical, and social advantages over large models like ChatGPT.

Contribution

It provides a comprehensive overview of mini-giant models, compares various small language models, and discusses their practical applications and evaluation methods.

Findings

01. Open source mini-giants are cost-effective alternatives to large models.

02. Small models can achieve competitive performance with proper techniques.

03. Open source community engagement accelerates development and ethical deployment.

Abstract

ChatGPT is phenomenal. However, it is prohibitively expensive to train and refine such giant models. Fortunately, small language models are flourishing and becoming more and more competent. We call them "mini-giants". We argue that open source communities such as Kaggle and mini-giants will achieve a win-win in many ways: technically, ethically, and socially. In this article, we present a brief yet rich background, discuss how to attain small language models, present a comparative study of small language models along with a brief discussion of evaluation methods, discuss the real-world application scenarios where small language models are most needed, and conclude with discussion and outlook.


Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling