HARE: HumAn pRiors, a key to small language model Efficiency

Lingyun Zhang, Bin Jin, Gaojian Ge, Lunhui Liu, Xuewen Shen, Mingyong Wu, Houqian Zhang, Yongneng Jiang, Shiqi Chen, Shi Pu

arXiv:2406.11410·cs.CL·June 19, 2024

TL;DR

This paper introduces HARE, a method leveraging human priors to construct concise, high-quality training data for small language models, improving efficiency and performance in resource-limited settings.

Contribution

It proposes a principle for incorporating human priors into data construction and demonstrates its effectiveness with the HARE-1.1B model, outperforming existing small language models.

Findings

01. HARE-1.1B achieves competitive results on benchmark datasets.

02. Using human priors enhances training efficiency in resource-constrained environments.

03. The principle guides effective data construction for small language models.

Abstract

Human priors play a crucial role in efficiently utilizing data in deep learning. However, with the development of large language models (LLMs), there is an increasing emphasis on scaling both model size and data volume, which often diminishes the importance of human priors in data construction. Influenced by these trends, existing Small Language Models (SLMs) mainly rely on web-scraped large-scale training data, neglecting the proper incorporation of human priors. This oversight limits the training efficiency of language models in resource-constrained settings. In this paper, we propose a principle to leverage human priors for data construction. This principle emphasizes achieving high-performance SLMs by training on a concise dataset that accommodates both semantic diversity and data quality consistency, while avoiding benchmark data leakage. Following this principle, we train an SLM…
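One part of the principle above, avoiding benchmark data leakage, can be illustrated with a small sketch. The excerpt does not say how the authors detect leakage, so what follows is an assumption: a common n-gram overlap filter, not the paper's method. The function names (`ngrams`, `build_benchmark_index`, `decontaminate`) and the default window size of 8 tokens are hypothetical choices made for illustration.

```python
from typing import Iterable, List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(text: str, n: int) -> Set[NGram]:
    """Return the set of lowercase whitespace-token n-grams in text."""
    tokens = text.lower().split()
    # Texts shorter than n tokens yield an empty set.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_benchmark_index(benchmark_texts: Iterable[str], n: int) -> Set[NGram]:
    """Collect every n-gram that appears in any benchmark example."""
    index: Set[NGram] = set()
    for text in benchmark_texts:
        index |= ngrams(text, n)
    return index


def decontaminate(corpus: Iterable[str],
                  benchmark_texts: Iterable[str],
                  n: int = 8) -> List[str]:
    """Keep only documents that share no n-gram with the benchmark.

    Assumption: any shared n-gram counts as leakage. Documents shorter
    than n tokens have no n-grams and are therefore always kept.
    """
    index = build_benchmark_index(benchmark_texts, n)
    return [doc for doc in corpus if ngrams(doc, n).isdisjoint(index)]


if __name__ == "__main__":
    benchmark = ["what is the capital of france paris is the answer"]
    corpus = [
        "Trivia: What is the capital of France Paris is the answer, obviously",
        "The hare outran the tortoise in a short race today",
    ]
    # The first document shares a 5-gram with the benchmark and is dropped.
    print(decontaminate(corpus, benchmark, n=5))
```

Exact n-gram matching is cheap but misses paraphrased leakage; a real pipeline would likely pair it with normalization or fuzzy matching, which is outside the scope of this sketch.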

Code & Models

Models

Hugging Face: RichardErkhov/LiteAI_-_Hare-1.1B-base-gguf (model, 470 downloads)

Taxonomy

Topics: Topic Modeling