Machine Learning Approaches for Principle Prediction in Naturally Occurring Stories
Md Sultan Al Nahian, Spencer Frazier, Brent Harrison, Mark Riedl

TL;DR
This paper investigates machine learning models for predicting nuanced moral principles from real-world stories, highlighting challenges in interpreting moral ambiguity for autonomous systems and humans alike.
Contribution
It extends a moral principles dataset and evaluates machine learning models against human performance in principle prediction from stories.
Findings
Models can classify individual principles
Moral ambiguity challenges both humans and machines
Machine learning approaches show promise but face interpretative limits
Abstract
Value alignment is the task of creating autonomous systems whose values align with those of humans. Past work has shown that stories are a potentially rich source of information on human values; however, past work has been limited to considering values in a binary sense. In this work, we explore the use of machine learning models for the task of normative principle prediction on naturally occurring story data. To do this, we extend a dataset that has been previously used to train a binary normative classifier with annotations of moral principles. We then use this dataset to train a variety of machine learning models, evaluate these models and compare their results against humans who were asked to perform the same task. We show that while individual principles can be classified, the ambiguity of what "moral principles" represent poses a challenge for both human participants and…
Taxonomy
Topics: Psychology of Moral and Emotional Judgment · Law in Society and Culture
Methods: ALIGN
