Towards Concept-Aware Large Language Models

Chen Shani; Jilles Vreeken; Dafna Shahaf

arXiv:2311.01866 · cs.CL · November 6, 2023 · 1 citation

TL;DR

This paper analyzes how well current large language models capture human concepts and their structure, discusses ways to build concept-aware models at different stages of the pipeline, and reports preliminary proof-of-concept results in which concept-level predictions better match human intuition and are more robust.

Contribution

It sketches a method for pretraining LLMs with concepts and explores a simpler approach that builds on the output of existing LLMs; the latter serves as the paper's proof-of-concept.

Findings

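01 The concept-aware proof-of-concept better matches human intuition

02 The proof-of-concept also improves the robustness of predictions

03 The simple output-based approach shows promising preliminary results; concept-based pretraining is sketched as a method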

Abstract

Concepts play a pivotal role in various human cognitive functions, including learning, reasoning and communication. However, there is very little work on endowing machines with the ability to form and reason with concepts. In particular, state-of-the-art large language models (LLMs) work at the level of tokens, not concepts. In this work, we analyze how well contemporary LLMs capture human concepts and their structure. We then discuss ways to develop concept-aware LLMs, taking place at different stages of the pipeline. We sketch a method for pretraining LLMs using concepts, and also explore the simpler approach that uses the output of existing LLMs. Despite its simplicity, our proof-of-concept is shown to better match human intuition, as well as improve the robustness of predictions. These preliminary results underscore the promise of concept-aware LLMs.
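
The output-based route mentioned in the abstract can be pictured as regrouping an existing model's token-level predictions into concept-level ones. The sketch below is only an illustration under assumptions that the abstract does not state: the token probabilities are made up, and `token_to_concept` is a hypothetical hand-written lookup standing in for whatever grouping the authors actually use.

```python
from collections import defaultdict


def concept_distribution(token_probs, token_to_concept):
    """Sum token-level probabilities into concept-level probabilities.

    Tokens missing from the (hypothetical) mapping form their own
    single-token concept, so no probability mass is lost.
    """
    concept_probs = defaultdict(float)
    for token, prob in token_probs.items():
        concept = token_to_concept.get(token, token)
        concept_probs[concept] += prob
    return dict(concept_probs)


# Illustrative, made-up completions for a prompt such as "I adopted a ___".
token_probs = {"dog": 0.30, "puppy": 0.25, "cat": 0.35, "kitten": 0.10}

# Hypothetical grouping of surface forms into concepts (an assumption).
token_to_concept = {"dog": "DOG", "puppy": "DOG", "cat": "CAT", "kitten": "CAT"}

concepts = concept_distribution(token_probs, token_to_concept)
top_token = max(token_probs, key=token_probs.get)
top_concept = max(concepts, key=concepts.get)
print(top_token, top_concept)
```

With these made-up numbers the most likely single token is `cat`, while the most likely concept is `DOG`, because its probability mass is split across two surface forms. This is the kind of gap between token-level and concept-level predictions that the abstract points to.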

Code & Models

Repositories

chenxshani/towards-concept-aware-llms
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems