Detection of Slang Words in e-Data using semi-Supervised Learning
Alok Ranjan Pal, Diganta Saha

TL;DR
This paper presents a semi-supervised learning approach to detect slang and abbreviated forms of words in electronic data, leveraging synset and concept analysis to evaluate the likelihood of words being slang.
Contribution
It introduces a novel semi-supervised method for detecting slang and abbreviations, including sound-alike and taboo forms, in electronic communication data.
Findings
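Detects complete slang words with a supervised procedure and their abbreviated forms with a semi-supervised procedure.
Uses synset and concept analysis to estimate the probability that a suspicious word is slang.
Addresses slang that appears in incomplete forms, such as sound-alike forms and taboo morphemes.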
Abstract
The proposed algorithmic approach deals with finding the sense of a word in electronic data. Nowadays, in communication media such as the internet and mobile services, people use some words that are slang in nature. This approach detects those abusive words using a supervised learning procedure. In real-life scenarios, however, slang words are not always used in their complete forms. Most of the time, they appear in abbreviated forms such as sound-alike forms and taboo morphemes. The proposed approach can also detect those abbreviated forms using a semi-supervised learning procedure. Using synset and concept analysis of the text, the probability that a suspicious word is a slang word is also evaluated.
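The abstract does not give the algorithm itself, so the following is only a minimal sketch of how abbreviated and sound-alike forms might be matched against a small labeled seed list and then used to grow that list (a simple self-training loop). Everything in it is an assumption made for illustration: the seed words, the character-substitution table, the vowel-dropping key, the 0.9 weight, and the 0.85 threshold are hypothetical and are not taken from the paper. It also does not attempt the synset and concept analysis mentioned above.

```python
import re
from difflib import SequenceMatcher

# Hypothetical seed lexicon (the "labeled" part); mild placeholder words only.
SEED_SLANG = {"damn", "crap", "idiot"}

# Assumed character substitutions seen in obfuscated spellings (not from the paper).
LEET = str.maketrans({"@": "a", "4": "a", "3": "e", "1": "i",
                      "!": "i", "0": "o", "$": "s", "5": "s"})


def normalize(token):
    """Lowercase, undo substitutions, drop non-letters, squeeze repeated letters."""
    token = token.lower().translate(LEET)
    token = re.sub(r"[^a-z]", "", token)
    return re.sub(r"(.)\1+", r"\1", token)


def skeleton(word):
    """Keep the first letter and drop later vowels: a crude sound-alike key."""
    return word[:1] + re.sub(r"[aeiou]", "", word[1:])


def slang_score(token, lexicon):
    """Best similarity in [0, 1] between a token and any lexicon entry."""
    norm = normalize(token)
    if not norm:
        return 0.0
    best = 0.0
    for entry in lexicon:
        ref = normalize(entry)
        full = SequenceMatcher(None, norm, ref).ratio()
        skel = SequenceMatcher(None, skeleton(norm), skeleton(ref)).ratio()
        # A skeleton-only match is weaker evidence, so it is down-weighted.
        best = max(best, full, 0.9 * skel)
    return best


def self_train(tokens, seed, threshold=0.85, rounds=2):
    """Grow the lexicon with unlabeled tokens that score above the threshold."""
    lexicon = {normalize(w) for w in seed}
    for _ in range(rounds):
        new = {normalize(t) for t in tokens
               if slang_score(t, lexicon) >= threshold}
        new -= lexicon
        if not new:
            break
        lexicon |= new
    return lexicon


if __name__ == "__main__":
    unlabeled = ["d@mn", "dmn", "crrrap", "table", "hello"]
    for tok in unlabeled:
        print(tok, round(slang_score(tok, SEED_SLANG), 2))
    print(sorted(self_train(unlabeled, SEED_SLANG)))
```

A loop like this can drift, because each accepted variant widens what the next round will match. Keeping the threshold high and the number of rounds small is one way to limit that, at the cost of missing rarer abbreviations.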
