A Reinforcement Learning-based Offensive semantics Censorship System for Chatbots

Shaokang Cai; Dezhi Han; Zibin Zheng; Dun Li; Noel Crespi

arXiv:2207.10569·cs.CL·July 22, 2022

Open Access

TL;DR

This paper introduces a reinforcement learning-based system for offensive semantics censorship in chatbots, aiming to detect and purify offensive content efficiently while maintaining reply quality.

Contribution

It presents a novel semantics censorship framework combining offensive semantics detection and purification, with an integrated few-shot learning approach for rapid training.

Findings

1. Reduces offensive reply generation probability in chatbots.

2. Accelerates semantics purification with a once-through learning method.

3. Improves training speed while maintaining reply quality.

Abstract

The rapid development of artificial intelligence (AI) technology has enabled large-scale AI applications to reach the market and enter practical use. However, while AI technology has brought people many conveniences in the course of productization, it has also exposed many security issues. In particular, attacks against the online-learning vulnerabilities of chatbots occur frequently. Therefore, this paper proposes a semantics censorship chatbot system based on reinforcement learning, composed mainly of two parts: the offensive semantics censorship model and the semantics purification model. The offensive semantics censorship model can combine the context of user input sentences to detect rapidly evolving offensive semantics and respond to offensive semantics. The semantics purification model targets the case where the chatbot model has been contaminated by large numbers of offensive…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Ethics and Social Impacts of AI

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings