Reservoir Computing as a Language Model
Felix Köster, Atsushi Uchida

TL;DR
This paper compares reservoir computing and transformer models for character-level language modeling, highlighting trade-offs in prediction accuracy and computational efficiency, and exploring attention-enhanced reservoirs for improved performance.
Contribution
It introduces a comparative analysis of reservoir computing approaches against transformers for language modeling, including novel attention-enhanced reservoirs, emphasizing efficiency and scalability.
Findings
Transformers outperform reservoirs in prediction quality.
Reservoir computing offers higher efficiency with faster training and inference.
Attention-enhanced reservoirs adapt dynamically, improving performance.
Abstract
Large Language Models (LLMs) have dominated the science and media landscape due to their impressive performance in processing large chunks of data and producing human-like text. Nevertheless, their huge energy demand and slow processing remain a bottleneck to further increasing quality while also making the models accessible to everyone. To address this bottleneck, we investigate how reservoir computing performs on natural text processing, which could enable fast and energy-efficient hardware implementations. Studies investigating the use of reservoir computing as a language model remain sparse. In this paper, we compare three distinct approaches for character-level language modeling: two different reservoir computing approaches, where only an output layer is trainable, and the well-known transformer-based architectures, which fully learn an attention-based…
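To make "only an output layer is trainable" concrete, here is a minimal sketch of a character-level reservoir with a static linear readout. It is an illustration under stated assumptions, not the configuration used in the paper: the function names (`run_reservoir`, `fit_readout`, `next_char_accuracy`), the reservoir size, the tanh nonlinearity, the spectral radius, and the ridge-regression readout are all hypothetical choices typical of an echo state network.

```python
import numpy as np


def run_reservoir(ids, W_in, W):
    """Drive the fixed reservoir with a sequence of character ids."""
    x = np.zeros(W.shape[0])
    states = np.empty((len(ids), W.shape[0]))
    for t, i in enumerate(ids):
        # One-hot input: W_in @ one_hot(i) is just column i of W_in.
        x = np.tanh(W_in[:, i] + W @ x)
        states[t] = x
    return states


def fit_readout(text, n_reservoir=200, spectral_radius=0.9, ridge=1e-4, seed=0):
    """Train only the linear readout; input and recurrent weights stay random."""
    rng = np.random.default_rng(seed)
    chars = sorted(set(text))
    index = {c: i for i, c in enumerate(chars)}
    ids = np.array([index[c] for c in text])

    # Fixed (untrained) weights, assumed here to be dense random matrices.
    W_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, len(chars)))
    W = rng.normal(size=(n_reservoir, n_reservoir))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))

    # Reservoir states for every character except the last, paired with
    # one-hot targets for the character that follows.
    states = run_reservoir(ids[:-1], W_in, W)
    targets = np.eye(len(chars))[ids[1:]]

    # Ridge regression in closed form: the only training step.
    gram = states.T @ states + ridge * np.eye(n_reservoir)
    W_out = np.linalg.solve(gram, states.T @ targets).T
    return chars, W_in, W, W_out


def next_char_accuracy(text, chars, W_in, W, W_out):
    """Fraction of positions where the readout's argmax is the next character."""
    index = {c: i for i, c in enumerate(chars)}
    ids = np.array([index[c] for c in text])
    states = run_reservoir(ids[:-1], W_in, W)
    predictions = np.argmax(states @ W_out.T, axis=1)
    return float(np.mean(predictions == ids[1:]))
```

Because the readout is fitted in closed form and nothing is backpropagated through the recurrent weights, training cost in this sketch is dominated by a single linear solve, which is the kind of efficiency the comparison with transformers is concerned with.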
