From Morality Installation in LLMs to LLMs in Morality-as-a-System
Gunter Bombaerts

TL;DR
This paper proposes a morality-as-a-system framework for LLMs that treats moral behavior as a dynamic, emergent property of interconnected components, in contrast to the paradigm of installing morality at training time.
Contribution
It introduces a sociotechnical system perspective on LLM morality, emphasizing continuous reproduction and structural coupling, and offers a conceptual framework for lifecycle monitoring and governance.
Findings
Reframes morality in LLMs as a dynamic system rather than fixed at training.
Identifies structural coupling failures as root causes of interpretability and governance issues.
Suggests hypotheses for technical research and governance improvements.
Abstract
Work on morality in large language models (LLMs) has progressed via constitutional AI, reinforcement learning from human feedback (RLHF) and systematic benchmarking, yet it still lacks tools to connect internal moral representations to regulatory obligations, to design cultural plurality across the full development stack, and to monitor how moral properties drift over the lifecycle of a deployed system. These difficulties share a root: morality is treated as something installed in a model at training time. I propose instead a morality-as-a-system framework, grounded in Niklas Luhmann's social systems theory, that treats LLM morality as a dynamic, emergent property of a sociotechnical system. Moral behaviour in a deployed LLM is not fixed at training. It is continuously reproduced through interactions among seven structurally coupled components spanning the neural substrate, training data, alignment…
Taxonomy
Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Embodied and Extended Cognition
