Misaligned from Within: Large Language Models Reproduce Our Double-Loop Learning Blindness
Tim Rogers, Ben Teehankee

TL;DR
This paper explores how Large Language Models tend to reproduce Model 1 theories-in-use, a defensive reasoning pattern absorbed from human-generated text that can hinder organizational learning and reinforce anti-learning behaviors.
Contribution
The paper argues that LLMs inherit Model 1 theories-in-use and demonstrates, through a case study, how this inheritance reinforces unproductive problem-solving in organizational contexts.
Findings
LLMs can inherit and amplify human cognitive biases.
An LLM acting as an HR consultant reinforces anti-learning practices.
The paper points to potential pathways for developing LLMs that support deeper organizational learning.
Abstract
This paper examines a critical yet unexplored dimension of the AI alignment problem: the potential for Large Language Models (LLMs) to inherit and amplify existing misalignments between human espoused theories and theories-in-use. Drawing on action science research, we argue that LLMs trained on human-generated text likely absorb and reproduce Model 1 theories-in-use - a defensive reasoning pattern that both inhibits learning and creates ongoing anti-learning dynamics at the dyad, group, and organisational levels. Through a detailed case study of an LLM acting as an HR consultant, we show how its advice, while superficially professional, systematically reinforces unproductive problem-solving approaches and blocks pathways to deeper organisational learning. This represents a specific instance of the alignment problem where the AI system successfully mirrors human behaviour but inherits…
