Formal Aspects of Language Modeling

Ryan Cotterell; Anej Svete; Clara Meister; Tianyu Liu; Li Du

arXiv:2311.04329 · cs.CL · April 18, 2024 · 1 citation

Open Access

TL;DR

These notes set out the formal mathematical foundations of large language models, covering what constitutes a language model from a theoretical perspective, so that developers and researchers can understand and implement such models.

Contribution

It offers a formal, theoretical perspective on language models, complementing practical and empirical approaches in NLP.

Findings

01. Provides a formal framework for understanding language models

02. Clarifies the mathematical principles underlying large language models

03. Serves as a theoretical foundation for further research

Abstract

Large language models have become one of the most commonly deployed NLP inventions. In the past half-decade, their integration into core natural language processing tools has dramatically increased the performance of such tools, and they have entered the public discourse surrounding artificial intelligence. Consequently, it is important for both developers and researchers alike to understand the mathematical foundations of large language models, as well as how to implement them. These notes are the accompaniment to the theoretical portion of the ETH Zürich course on large language models, covering what constitutes a language model from a formal, theoretical perspective.

Taxonomy

Topics: Natural Language Processing Techniques