From Model Training to Model Raising
Roland Aydin, Christian Cyron, Steve Bachelor, Ashton Anderson, Robert West

TL;DR
This paper introduces a new paradigm called 'model raising' that integrates alignment and value embedding into the early stages of AI model development by redesigning training data to foster intrinsic value systems from the outset.
Contribution
It proposes a novel approach to AI training that embeds values during initial development, contrasting with traditional methods that align models post-training.
Findings
Redesigning training data around a first-person perspective, lived experience, and simulated social interactions is expected to promote early value embedding.
Early value commitment is expected to make knowledge, skills, and values much harder to separate.
The approach aims to produce models whose alignment is rooted in training from the first token onward.
Abstract
Current AI training methods align models with human values only after their core capabilities have been established, resulting in models that are easily misaligned and lack deep-rooted value systems. We propose a paradigm shift from "model training" to "model raising", in which alignment is woven into a model's development from the start. We identify several key components for this paradigm, all centered around redesigning the training corpus: reframing training data from a first-person perspective, recontextualizing information as lived experience, simulating social interactions, and scaffolding the ordering of training data. We expect that this redesign of the training corpus will lead to an early commitment to values from the first training token onward, such that knowledge, skills, and values are intrinsically much harder to separate. In an ecosystem in which large language model…
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Multimodal Machine Learning Applications
