Formal Aspects of Language Modeling
Ryan Cotterell, Anej Svete, Clara Meister, Tianyu Liu, Li Du

TL;DR
These lecture notes, which accompany the theoretical portion of the ETH Zürich course on large language models, set out the formal mathematical foundations of language models for developers and researchers who need to understand and implement them.
Contribution
It offers a formal, theoretical perspective on language models, complementing practical and empirical approaches in NLP.
Findings
Provides a formal framework for understanding language models
Clarifies the mathematical principles underlying large language models
Serves as a theoretical foundation for further research
Abstract
Large language models have become one of the most commonly deployed NLP inventions. In the past half-decade, their integration into core natural language processing tools has dramatically increased the performance of such tools, and they have entered the public discourse surrounding artificial intelligence. Consequently, it is important for both developers and researchers alike to understand the mathematical foundations of large language models, as well as how to implement them. These notes are the accompaniment to the theoretical portion of the ETH Zürich course on large language models, covering what constitutes a language model from a formal, theoretical perspective.
Taxonomy
Topics: Natural Language Processing Techniques
