Distinguishing Scams and Fraud with Ensemble Learning

Isha Chadalavada; Tianhui Huang; Jessica Staddon

arXiv:2412.08680 · cs.CR · December 13, 2024 · 2 cites

TL;DR

This paper introduces an ensemble learning method using large language models to differentiate scam complaints from non-scam fraud in the CFPB database, aiming to improve scam detection accuracy.

Contribution

It presents a novel LLM ensemble approach specifically designed for distinguishing scam complaints from other fraud reports in a financial complaints dataset.

Findings

01 Ensemble LLMs outperform individual models in scam detection.

02 Identifies strengths and weaknesses of LLMs in scam classification.

03 Provides initial evaluation results on CFPB complaints data.

Abstract

Users increasingly query LLM-enabled web chatbots for help with scam defense. The Consumer Financial Protection Bureau's complaints database is a rich data source for evaluating LLM performance on user scam queries, but currently the corpus does not distinguish between scam and non-scam fraud. We developed an LLM ensemble approach to distinguishing scam and fraud CFPB complaints and describe initial findings regarding the strengths and weaknesses of LLMs in the scam defense context.
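
The abstract does not say how the ensemble combines its members, so the following is only a minimal sketch of one common choice, majority voting, and should not be read as the authors' method. The names `ensemble_label` and `keyword_stub`, the label strings "scam" and "fraud", and the tie-breaking rule are all hypothetical; in practice each classifier would wrap a call to an LLM rather than a keyword check.

```python
from collections import Counter
from typing import Callable, Sequence

Label = str                          # assumed label set: "scam" or "fraud"
Classifier = Callable[[str], Label]  # stand-in for one LLM-backed labeler


def ensemble_label(
    complaint: str,
    classifiers: Sequence[Classifier],
    tie_label: Label = "fraud",
) -> Label:
    """Return the majority label across classifiers (assumed aggregation rule).

    Ties fall back to `tie_label`, an arbitrary choice made for this sketch.
    """
    if not classifiers:
        raise ValueError("at least one classifier is required")
    votes = Counter(clf(complaint) for clf in classifiers)
    ranked = votes.most_common()
    # A tie exists when the two most frequent labels have equal counts.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


def keyword_stub(keywords: Sequence[str]) -> Classifier:
    """Hypothetical placeholder for an LLM call: votes 'scam' on a keyword hit."""
    def classify(text: str) -> Label:
        lowered = text.lower()
        return "scam" if any(k in lowered for k in keywords) else "fraud"
    return classify


# Three toy "models" that disagree on some inputs.
stubs = [
    keyword_stub(["gift card"]),
    keyword_stub(["pretended", "impersonat"]),
    keyword_stub(["wire"]),
]

scam_example = "A caller pretended to be my bank and asked for gift cards."
fraud_example = "An unauthorized charge appeared on my card."

print(ensemble_label(scam_example, stubs))
print(ensemble_label(fraud_example, stubs))
```

Using an odd number of members avoids ties for a two-label task; weighted voting or a separate adjudicating model are other plausible designs, none of which the abstract confirms.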

Taxonomy

Topics: Imbalanced Data Classification Techniques