Textbooks Are All You Need II: phi-1.5 technical report
Yuanzhi Li, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar, Yin Tat Lee

TL;DR
This paper introduces phi-1.5, a 1.3 billion parameter Transformer model trained on textbook-like data, which performs comparably to models 5x larger on natural language tasks and demonstrates capabilities in reasoning, coding, and mathematics.
Contribution
The paper presents phi-1.5, a novel smaller language model trained on textbook data, showing competitive performance and insights into model traits and biases without web data.
Findings
phi-1.5 performs on par with models 5x larger on natural language tasks
It surpasses most non-frontier LLMs on more complex reasoning tasks such as grade-school mathematics and basic coding
The model exhibits both strengths and weaknesses similar to larger models
Abstract
We continue the investigation into the power of smaller Transformer-based language models as initiated by TinyStories (a 10 million parameter model that can produce coherent English) and the follow-up work on phi-1, a 1.3 billion parameter model with Python coding performance close to the state-of-the-art. The latter work proposed to use existing Large Language Models (LLMs) to generate "textbook quality" data as a way to enhance the learning process compared to traditional web data. We follow the "Textbooks Are All You Need" approach, focusing this time on common sense reasoning in natural language, and create a new 1.3 billion parameter model named phi-1.5, with performance on natural language tasks comparable to models 5x larger, and surpassing most non-frontier LLMs on more complex reasoning tasks such as grade-school mathematics and basic coding.…
Code & Models
- 🤗 microsoft/phi-1_5 · model · 97k dl · ♡ 1355
- 🤗 typeof/phi-1_5 · model · 5 dl
- 🤗 typeof/phi-1_5_foreval · model · 4 dl
- 🤗 Open-Orca/oo-phi-1_5 · model · 23 dl · ♡ 32
- 🤗 jncraton/phi-1_5-ct2-int8 · model · 2 dl
- 🤗 hamadandrabi/Microsoft_Phi_gsm8k · model · 3 dl
- 🤗 michaelfeil/ct2fast-phi-1_5 · model · 3 dl
- 🤗 OpenNMT/phi-1_5-ct2-int8 · model · 5 dl · ♡ 1
- 🤗 mariordoniez/phi · model · 2 dl
- 🤗 TKDKid1000/phi-1_5-GGUF · model · 908 dl · ♡ 8
Videos
Phi-2, Imagen-2, Optimus-Gen-2: Small New Models to Change the World?· youtube
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
