Distilling Rule-based Knowledge into Large Language Models

Wenkai Yang; Yankai Lin; Jie Zhou; Ji-Rong Wen

arXiv:2311.08883 · cs.CL · December 17, 2024 · 2 cites

TL;DR

This paper introduces a rule distillation method to encode rule-based knowledge into large language models, improving learning efficiency and generalization compared to traditional example-based training.

Contribution

It proposes a novel rule distillation approach that leverages in-context learning to explicitly encode rules into LLMs, enhancing their ability to learn from limited data.
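The distillation idea can be illustrated with a small sketch. It assumes, as one common way to set up such an objective rather than the paper's confirmed formulation, that a "teacher" pass sees the textual rule in its prompt, a "student" pass sees only the task input, and the student is trained to match the teacher's next-token distributions by minimizing a KL divergence. The function names and the toy logits are hypothetical and stand in for real model outputs.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(teacher_logits, student_logits):
    """Mean KL(teacher || student) across token positions.

    teacher_logits: per-position logits from the pass that saw the rule
                    in context (hypothetical values here).
    student_logits: per-position logits from the pass without the rule.
    """
    per_position = [
        kl_divergence(softmax(t), softmax(s))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(per_position) / len(per_position)


# Toy example: 2 token positions over a 3-token vocabulary.
teacher_logits = [[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]]  # rule in context
student_logits = [[0.3, 0.2, 0.1], [0.5, 0.4, 0.6]]   # no rule in context

loss_before = distillation_loss(teacher_logits, student_logits)
loss_matched = distillation_loss(teacher_logits, teacher_logits)
```

In this sketch `loss_before` is positive because the student's distributions differ from the teacher's, and `loss_matched` is zero once they coincide. In a real setup the logits would come from an LLM and the loss would be minimized by gradient descent on the student's parameters.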

Findings

01 Rule distillation outperforms example-based learning in sample efficiency.

02 The method improves LLMs' generalization ability.

03 Explicit rule encoding enhances learning from limited data.

Abstract

Large language models (LLMs) have shown incredible performance in completing various real-world tasks. The current paradigm of knowledge learning for LLMs is mainly based on learning from examples, in which LLMs learn the internal rule implicitly from a certain number of supervised examples. However, this learning paradigm may not learn complicated rules well, especially when training examples are limited. We are inspired by the fact that humans can learn new tasks or knowledge in another way: by learning from rules. That is, humans can learn new tasks or grasp new knowledge quickly and generalize well given only a detailed rule and a few optional examples. Therefore, in this paper, we explore the feasibility of this new learning paradigm, which aims to encode rule-based knowledge into LLMs. We further propose rule distillation, which first uses the strong in-context…

Code & Models

Repositories

rucbm/rule-distillation
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis