What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations

Kavel Rao; Liwei Jiang; Valentina Pyatkin; Yuling Gu; Niket Tandon; Nouha Dziri; Faeze Brahman; Yejin Choi

arXiv:2310.15431 · cs.CL · May 24, 2024 · 1 citation

Open Access

TL;DR

This paper introduces a novel iterative self-distillation method to generate high-quality, diverse contextual explanations for moral judgments, enhancing the understanding of nuanced human moral reasoning.

Contribution

It presents a new approach combining self-distillation, filtering, and imitation learning to improve the validity and diversity of moral reasoning contexts and rationales.

Findings

1. Produced a dataset of 1.2 million contextualized moral judgments

2. Achieved high human agreement rates of 85.9% to 99.8%

3. Final model outperforms intermediate models significantly

Abstract

Moral or ethical judgments rely heavily on the specific contexts in which they occur. Understanding varying shades of defeasible contextualizations (i.e., additional information that strengthens or attenuates the moral acceptability of an action) is critical to accurately represent the subtlety and intricacy of grounded human moral judgment in real-life scenarios. We introduce defeasible moral reasoning: a task to provide grounded contexts that make an action more or less morally acceptable, along with commonsense rationales that justify the reasoning. To elicit high-quality task data, we take an iterative self-distillation approach that starts from a small amount of unstructured seed knowledge from GPT-3 and then alternates between (1) self-distillation from student models; (2) targeted filtering with a critic model trained by human judgment (to boost validity) and NLI (to boost…
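The alternating loop described in the abstract can be sketched roughly as below. This is a toy illustration, not the authors' implementation: the `Example` record layout, the function names, the `threshold` value, and the use of a pairwise redundancy check to stand in for the NLI step are all assumptions made here, and the critic, NLI check, and student training are passed in as plain callables.

```python
from typing import Callable, List, Tuple

# Hypothetical record layout: (action, context, rationale).
Example = Tuple[str, str, str]
Student = Callable[[], List[Example]]


def filter_examples(
    candidates: List[Example],
    validity: Callable[[Example], float],
    redundant: Callable[[Example, Example], bool],
    threshold: float = 0.5,
) -> List[Example]:
    """Keep candidates that the critic accepts and that add something new."""
    kept: List[Example] = []
    for cand in candidates:
        # Critic step (assumed to return a score in [0, 1]).
        if validity(cand) < threshold:
            continue
        # Stand-in for the NLI step: drop items redundant with ones already kept.
        if any(redundant(cand, prev) for prev in kept):
            continue
        kept.append(cand)
    return kept


def self_distill(
    seed: List[Example],
    train: Callable[[List[Example]], Student],
    validity: Callable[[Example], float],
    redundant: Callable[[Example, Example], bool],
    rounds: int = 2,
) -> List[Example]:
    """Alternate between training a student, generating, and filtering."""
    data = list(seed)
    for _ in range(rounds):
        student = train(data)   # fit a student on the current data
        candidates = student()  # the student proposes new examples
        data = filter_examples(candidates, validity, redundant)
    return data
```

Passing the critic and redundancy check as callables keeps the sketch independent of any particular model or library; in practice each would wrap a trained classifier.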

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Explainable Artificial Intelligence (XAI)

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Byte Pair Encoding · Dropout · Weight Decay · Layer Normalization · Linear Warmup With Cosine Annealing