A Data Fusion Framework for Multi-Domain Morality Learning

Siyi Guo; Negar Mokhberian; Kristina Lerman

arXiv:2304.02144·cs.CL·April 6, 2023·1 cites

Open Access

TL;DR

This paper introduces a data fusion framework that combines multiple heterogeneous morality datasets using domain adversarial training and weighted loss, leading to improved performance and generalization in morality inference tasks.

Contribution

The paper proposes a novel data fusion framework employing domain adversarial training and weighted loss to enhance morality recognition across diverse datasets.

Findings

01. Achieves state-of-the-art performance on morality datasets
02. Improves model generalization across domains
03. Effectively handles dataset heterogeneity

Abstract

Language models can be trained to recognize the moral sentiment of text, creating new opportunities to study the role of morality in human life. As interest in language and morality has grown, several ground truth datasets with moral annotations have been released. However, these datasets vary in data collection method, domain, topics, annotator instructions, and other respects. Simply aggregating such heterogeneous datasets during training can yield models that fail to generalize well. We describe a data fusion framework for training on multiple heterogeneous datasets that improves performance and generalizability. The model uses domain adversarial training to align the datasets in feature space and a weighted loss function to deal with label shift. We show that the proposed framework achieves state-of-the-art performance on different datasets compared to prior work in morality inference.
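The two ingredients named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not specify the exact loss or architecture, so the inverse-frequency weighting below is one common way to handle label shift and is an assumption, as are all names (`positive_class_weight`, `weighted_bce`, `GradientReversal`, `lam`). The gradient reversal step is the standard device in domain adversarial training: features pass through unchanged on the forward pass, and the gradient coming back from the domain classifier is negated so that the feature extractor learns representations the domain classifier cannot separate.

```python
import math


def positive_class_weight(labels):
    """Hypothetical helper: weight positives by (#negatives / #positives).

    Computed per dataset, so a dataset where a moral label is rare
    gives that label more weight than a dataset where it is common.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    return neg / pos if pos else 1.0


def weighted_bce(probs, labels, pos_weight):
    """Mean binary cross-entropy with positive examples scaled by pos_weight."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total -= pos_weight * y * math.log(p + eps)
        total -= (1 - y) * math.log(1 - p + eps)
    return total / len(labels)


class GradientReversal:
    """Identity on the forward pass; scales gradients by -lam on the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # assumed trade-off between task loss and domain loss

    def forward(self, features):
        return list(features)

    def backward(self, grad_from_domain_classifier):
        return [-self.lam * g for g in grad_from_domain_classifier]


# Toy dataset: one positive label out of four examples.
labels = [1, 0, 0, 0]
weight = positive_class_weight(labels)
loss = weighted_bce([0.5, 0.5, 0.5, 0.5], labels, weight)

grl = GradientReversal(lam=0.5)
features = grl.forward([0.3, -1.2])
reversed_grad = grl.backward([0.2, -0.4])
```

In a real model the reversal would sit between a shared text encoder and a domain classifier inside an autograd framework; the class above only shows the forward and backward behavior in isolation.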

Taxonomy

Topics: Topic Modeling · Hate Speech and Cyberbullying Detection · Misinformation and Its Impacts

Methods: fail · ALIGN