GEB-1.3B: Open Lightweight Large Language Model

Jie Wu; Yufeng Zhu; Lei Shen; Xuqing Lu

arXiv:2406.09900·cs.CL·June 17, 2024

Open Access · 1 model

TL;DR

GEB-1.3B is an open-source, lightweight large language model optimized for CPU inference. It is trained on 550 billion Chinese and English tokens and then fine-tuned, and it reports competitive performance against similarly sized models.

Contribution

This work introduces GEB-1.3B, a resource-efficient LLM, reports its benchmark performance against comparable small models, and releases it as open source for lightweight NLP applications.

Findings

01 Outperforms models such as MindLLM-1.3B and TinyLLaMA-1.1B on benchmarks.

02 The FP32 version achieves good inference times on CPUs.

03 Uses ROPE, Group-Query-Attention, and FlashAttention-2 to accelerate training.
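To make the CPU FP32 finding concrete, here is a back-of-the-envelope sketch of the memory needed just to hold the weights. It assumes the nominal parameter count implied by the model name (1.3 billion), which is an approximation rather than a figure taken from the paper, and it ignores activations, the KV cache, and runtime overhead.

```python
def weight_memory_bytes(num_params: int, bytes_per_param: int) -> int:
    """Bytes needed to store the weights alone (no activations, no KV cache)."""
    return num_params * bytes_per_param


# Assumption: nominal size read off the "1.3B" in the model name.
NOMINAL_PARAMS = 1_300_000_000
FP32_BYTES = 4  # a 32-bit float occupies 4 bytes

fp32_bytes = weight_memory_bytes(NOMINAL_PARAMS, FP32_BYTES)
fp32_gib = fp32_bytes / 2**30

print(f"FP32 weights: {fp32_bytes / 1e9:.1f} GB ({fp32_gib:.2f} GiB)")
# prints "FP32 weights: 5.2 GB (4.84 GiB)"
```

Under these assumptions the weights fit in the RAM of an ordinary desktop or laptop, which is consistent with the page's emphasis on CPU deployment.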

Abstract

Recently developed large language models (LLMs) such as ChatGPT, Claude, and Llama have demonstrated impressive abilities, and even surpass human-level performance in several tasks. Despite their success, the resource-intensive demands of these models, requiring significant computational power for both training and inference, limit their deployment to high-performance servers. Additionally, the extensive calculation requirements of the models often lead to increased latency in response times. With the increasing need for LLMs to operate efficiently on CPUs, research on lightweight models that are optimized for CPU inference has emerged. In this work, we introduce GEB-1.3B, a lightweight LLM trained on 550 billion tokens in both Chinese and English. We employ novel training techniques, including ROPE, Group-Query-Attention, and FlashAttention-2, to accelerate training while…
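The abstract names ROPE (rotary position embedding) without describing it. The sketch below is a generic, pure-Python illustration of the idea, not GEB-1.3B's implementation: the function names, the interleaved pairing of dimensions, and the base of 10000 are assumptions chosen for the example. It rotates each pair of vector components by an angle proportional to the token position and then checks the property that makes the technique useful for attention, namely that the query-key dot product depends only on the relative offset between positions.

```python
import math


def apply_rope(vec, position, base=10000.0):
    """Rotate consecutive pairs (vec[2i], vec[2i+1]) by position * base**(-2i/d).

    Hypothetical helper for illustration; `base` is a commonly used default,
    not a value confirmed for GEB-1.3B.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even dimension")
    out = []
    for i in range(d // 2):
        theta = position * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        # Standard 2-D rotation of the pair (x, y) by angle theta.
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Toy query and key vectors (made-up numbers).
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.5, -0.75, 1.0]

# Both pairs of positions are 2 apart, so the scores should agree.
score_a = dot(apply_rope(q, 3), apply_rope(k, 1))
score_b = dot(apply_rope(q, 10), apply_rope(k, 8))
same_offset_matches = math.isclose(score_a, score_b, rel_tol=1e-9, abs_tol=1e-12)
print(same_offset_matches)
```

Because position enters only through the rotation, no separate position vector is added to the token embeddings; production implementations apply the same rotation to whole tensors of queries and keys at once.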

Code & Models

Models

GEB-AGI/geb-1.3b (Hugging Face model · 26 downloads · 18 likes)

Taxonomy

Topics: Topic Modeling · Mathematics, Computing, and Information Processing · Computational Physics and Python Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · LLaMA