Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization

Sahil Wadhwa; Chengtian Xu; Haoming Chen; Aakash Mahalingam; Akankshya Kar; Divya Chaudhary

arXiv:2412.15453·cs.CL·December 23, 2024

TL;DR

This paper introduces a novel approach to improve multilingual counter-speech generation by aligning large language models with human preferences using Direct Preference Optimization, resulting in more impactful and contextually appropriate responses.

Contribution

It presents a new methodology combining Supervised Fine-Tuning and Direct Preference Optimization to enhance counter-speech generation across multiple languages.

Findings

1. DPO-aligned models outperform SFT baselines on CS benchmarks.
2. The approach scales effectively to languages like Basque, Italian, and Spanish.
3. Knowledge grounding improves factual accuracy of generated responses.

Abstract

The automatic generation of counter-speech (CS) is a critical strategy for addressing hate speech by providing constructive and informed responses. However, existing methods often fail to generate high-quality, impactful, and scalable CS, particularly across diverse linguistic contexts. In this paper, we propose a novel methodology to enhance CS generation by aligning Large Language Models (LLMs) using Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). Our approach leverages DPO to align LLM outputs with human preferences, ensuring contextually appropriate and linguistically adaptable responses. Additionally, we incorporate knowledge grounding to enhance the factual accuracy and relevance of generated CS. Experimental results demonstrate that DPO-aligned models significantly outperform SFT baselines on CS benchmarks while scaling effectively to multiple languages.…
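
To make the preference-alignment step concrete, the sketch below computes the standard DPO loss for a single preference pair: one hate-speech prompt with a preferred ("chosen") and a dispreferred ("rejected") counter-speech response. It follows the generally published DPO objective and is not taken from the authors' code; the function name `dpo_loss`, the argument names, and the default `beta=0.1` are assumptions made for illustration. Each log-probability is the summed token log-likelihood of a response under either the policy being trained or the frozen reference model, which in an SFT-then-DPO pipeline would typically be the SFT checkpoint.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    All names and the default beta are illustrative assumptions,
    not values reported in the paper.
    """
    # How much more (in log space) the policy favors each response
    # than the frozen reference model does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin; positive when the policy has shifted
    # toward the chosen response relative to the reference.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: margin is 0, loss is log(2).
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)

# Policy has moved toward the chosen response: loss drops below log(2).
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)
```

Minimizing this loss over a dataset of preference pairs pushes the policy to assign relatively more probability to preferred counter-speech, while `beta` controls how far it may drift from the reference model.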

Taxonomy

Topics: Speech Recognition and Synthesis · Phonetics and Phonology Research · Speech and dialogue systems

Methods: Direct Preference Optimization · Supervised Fine-Tuning · ALIGN