Textual Data Mining for Financial Fraud Detection: A Deep Learning Approach
Qiuru Li

TL;DR
This paper explores deep learning models for NLP-based financial fraud detection, comparing various neural networks to improve accuracy in identifying fraudulent companies from regulatory texts.
Contribution
It introduces a comprehensive comparison of neural network models for NLP-based financial fraud detection, highlighting their effectiveness in analyzing regulatory and financial reports.
Findings
LSTM and GRU outperform other models in accuracy
Deep learning models significantly improve fraud detection capabilities
The approach offers valuable insights for industry and regulators
Abstract
In this report, I present a deep learning approach to a natural language processing (hereafter NLP) binary classification task: analyzing financial-fraud texts. First, I searched regulatory announcements and enforcement bulletins on HKEX news to identify fraudulent companies and extract their MD&A reports; I then organized the sentences from those reports with labels and reporting times. My methodology uses several kinds of neural network models for the text classification task: Multilayer Perceptrons with Embedding layers, vanilla Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU). With this diverse set of models, I aim to perform a comprehensive comparison of their accuracy in detecting financial fraud. My results bring significant implications for financial fraud detection as this work contributes to…
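The data-preparation step described in the abstract (sentences organized with labels and reporting time, then fed to models with Embedding layers) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the author's pipeline: the `LabeledSentence` record, the naive regex sentence splitter, the lowercase alphanumeric tokenizer, and the `<pad>`/`<unk>` vocabulary convention are all hypothetical choices, and the example text is invented.

```python
import re
from dataclasses import dataclass
from datetime import date

PAD_ID, UNK_ID = 0, 1  # assumed convention: 0 pads short sequences, 1 marks unseen tokens


@dataclass(frozen=True)
class LabeledSentence:
    """Hypothetical record: one MD&A sentence with its label and reporting date."""
    text: str
    label: int          # 1 = company flagged as fraudulent, 0 = otherwise
    report_date: date


def split_sentences(report_text):
    # Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", report_text.strip())
    return [p for p in parts if p]


def label_report(report_text, label, report_date):
    # Every sentence inherits the label and reporting date of its source report.
    return [LabeledSentence(s, label, report_date) for s in split_sentences(report_text)]


def tokenize(text):
    # Lowercase alphanumeric tokens; real pipelines may use a different tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(sentences):
    # Assign each new token the next free integer id, in order of first appearance.
    vocab = {"<pad>": PAD_ID, "<unk>": UNK_ID}
    for s in sentences:
        for tok in tokenize(s.text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(text, vocab, max_len):
    # Map tokens to ids, truncate to max_len, then right-pad with PAD_ID.
    ids = [vocab.get(tok, UNK_ID) for tok in tokenize(text)][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Invented example text, used only to exercise the functions.
rows = label_report("Revenue grew strongly. Margins improved.", 1, date(2020, 3, 31))
vocab = build_vocab(rows)
encoded = [encode(r.text, vocab, max_len=4) for r in rows]
```

Fixed-length integer sequences like `encoded` are the usual input to an embedding layer, which any of the compared architectures (MLP, RNN, LSTM, GRU) could then consume; the padding length and vocabulary handling would need to be tuned to the actual corpus.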
Taxonomy
Topics: Imbalanced Data Classification Techniques
