Large Language Models with Controllable Working Memory
Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar

TL;DR
This paper investigates how large language models weigh their internal (memorized) knowledge against information supplied in the context. It finds that state-of-the-art models have controllability and robustness problems, and proposes KAFT, a fine-tuning method that improves both properties by training with counterfactual and irrelevant contexts.
Contribution
The paper introduces KAFT (knowledge aware finetuning), a fine-tuning method that improves LLM controllability and robustness by adding counterfactual and irrelevant contexts to the training data.
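To make the idea of training with counterfactual and irrelevant contexts concrete, here is a minimal data-construction sketch. It is not the authors' implementation: every name (QAExample, TrainingExample, build_mixture, and so on) and the prompt format are hypothetical. It also assumes that the target for an irrelevant context is the model's own closed-book answer, which this summary does not state.

```python
from dataclasses import dataclass
import random


@dataclass
class QAExample:
    question: str
    context: str
    answer: str  # answer supported by the original context


@dataclass
class TrainingExample:
    prompt: str
    target: str
    kind: str  # "relevant", "counterfactual", or "irrelevant"


def format_prompt(question: str, context: str) -> str:
    # Hypothetical prompt layout; the source does not specify one.
    return f"context: {context}\nquestion: {question}"


def make_counterfactual(ex: QAExample, substitute: str) -> TrainingExample:
    # Swap the true answer for a substitute inside the context, and train
    # the model to follow the edited context (controllability).
    edited = ex.context.replace(ex.answer, substitute)
    return TrainingExample(format_prompt(ex.question, edited), substitute, "counterfactual")


def make_irrelevant(ex: QAExample, other_context: str, closed_book: str) -> TrainingExample:
    # Pair the question with an unrelated context. ASSUMPTION: the target is
    # the model's closed-book answer, so the model learns to ignore the
    # context (robustness).
    return TrainingExample(format_prompt(ex.question, other_context), closed_book, "irrelevant")


def build_mixture(examples, substitutes, closed_book_answers, seed=0):
    """Return relevant, counterfactual, and irrelevant variants of each example."""
    rng = random.Random(seed)
    mixture = []
    for i, ex in enumerate(examples):
        mixture.append(
            TrainingExample(format_prompt(ex.question, ex.context), ex.answer, "relevant")
        )
        mixture.append(make_counterfactual(ex, substitutes[i]))
        # Borrow a context from a different example to act as the distractor.
        others = [e.context for j, e in enumerate(examples) if j != i]
        if others:
            mixture.append(make_irrelevant(ex, rng.choice(others), closed_book_answers[i]))
    return mixture
```

With this layout, one source question yields up to three training rows. Whether the model follows the context or falls back on its memory is then decided by the context's relevance, not by the question alone.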
Findings
State-of-the-art models show poor controllability and robustness.
KAFT improves model performance across architectures and sizes.
Controllability and robustness do not scale with model size.
Abstract
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP), owing to their excellent understanding and generation abilities. Remarkably, what further sets these models apart is the massive amounts of world knowledge they internalize during pretraining. While many downstream applications provide the model with an informational context to aid its performance on the underlying task, how the model's world knowledge interacts with the factual information presented in the context remains underexplored. As a desirable behavior, an LLM should give precedence to the context whenever it contains task-relevant information that conflicts with the model's memorized knowledge. This enables model predictions to be grounded in the context, which can then be used to update or correct specific model predictions without frequent retraining. By contrast, when…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
Methods: Multi-Head Attention · Attention Is All You Need · Pathways Language Model · Byte Pair Encoding · Residual Connection · Dropout · Attention Dropout · Adafactor · Dense Connections · Softmax
