Employing Sentence Space Embedding for Classification of Data Stream from Fake News Domain

Paweł Zyblewski; Jakub Klikowski; Weronika Borek-Marciniec; Paweł Ksieniewicz

arXiv:2407.10807 · cs.CL · November 11, 2025

TL;DR

This paper introduces a novel approach for classifying fake news in data streams by encoding text into digital signals using sentence space embedding, enabling the use of convolutional neural networks typically used for images.

Contribution

The paper is the first to apply sentence space embedding to natural language data stream classification, which makes convolutional networks designed for image classification applicable to fake news detection on text.

Findings

01. Outperforms existing algorithms in generalization ability

02. Demonstrates competitive time complexity

03. Effective in real-world fake news detection scenarios

Abstract

Tabular data is considered the last unconquered castle of deep learning, yet the task of data stream classification is stated to be an equally important and demanding research area. Due to the temporal constraints, it is assumed that deep learning methods are not the optimal solution for application in this field. However, excluding the entire, and prevalent, group of methods seems rather rash given the progress that has been made in recent years in its development. For this reason, the following paper is the first to present an approach to natural language data stream classification using the sentence space method, which allows for encoding text into the form of a discrete digital signal. This allows the use of convolutional deep networks dedicated to image classification to solve the task of recognizing fake news based on text data. Based on the real-life Fakeddit dataset, the…
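The encoding idea in the abstract, turning a text into a fixed-size two-dimensional signal that an image-style network can consume, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes that each sentence becomes one row of a matrix, and it substitutes a toy hashed bag-of-words vector for a real sentence encoder. The names `embed_sentence` and `sentence_space` and the parameters `max_sentences` and `dim` are hypothetical.

```python
import hashlib
import re


def embed_sentence(sentence: str, dim: int = 8) -> list[float]:
    """Toy stand-in for a sentence encoder (assumption, not the paper's model).

    Each word is hashed into one of `dim` buckets and the counts are
    normalised so that a non-empty sentence sums to 1.
    """
    vec = [0.0] * dim
    for word in re.findall(r"[a-z0-9']+", sentence.lower()):
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec


def sentence_space(text: str, max_sentences: int = 4, dim: int = 8) -> list[list[float]]:
    """Encode a text as a fixed-size matrix: one row per sentence.

    Texts with more than `max_sentences` sentences are truncated and
    shorter texts are zero-padded, so every sample has the same shape.
    """
    # Naive sentence splitting on terminal punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    rows = [embed_sentence(s, dim) for s in sentences[:max_sentences]]
    rows += [[0.0] * dim for _ in range(max_sentences - len(rows))]
    return rows


matrix = sentence_space("Breaking news today. Scientists confirm it! Really?")
print(len(matrix), len(matrix[0]))
```

Because every text maps to a matrix of the same shape (here `max_sentences` by `dim`), a batch of such matrices could be handed to a 2D convolutional network as single-channel images, which is the role the abstract assigns to the sentence space representation.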

Code & Models

Repositories

w4k2/sentence-space-stream (PyTorch, official)

Taxonomy

Topics: Misinformation and Its Impacts · Spam and Phishing Detection