MongoDB Injection Query Classification Model using MongoDB Log files as Training Data

Shaunak Perni; Minal Shirodkar; Ramdas Karmalli

arXiv:2601.11996·cs.CR·January 21, 2026

Open Access

TL;DR

This paper develops a machine-learning model that classifies MongoDB injection attacks using log data and features extracted from attack logs. The best model reaches 71% accuracy, and the work aims to improve detection over traditional rule-based systems.

Contribution

It introduces a novel approach using log data and feature extraction for classifying NoSQL injection attacks, enhancing detection accuracy with AutoML models.

Findings

01. Best model achieved 71% accuracy

02. Log data features improve attack detection

03. AutoML outperforms manual models

Abstract

NoSQL injection attacks are a class of cybersecurity attacks in which an attacker sends a specially crafted query to a NoSQL database, which then performs an unauthorized operation. Rule-based systems were initially developed to defend against such attacks, but they proved ineffective against novel injection attacks, so model-based approaches were developed. Most model-based detection systems gave highly positive results during testing but were trained only on the query statement sent to the server. However, owing to data scarcity and class imbalance, these model-based systems were found not to be effective against all real-world attacks. This paper explores classifying NoSQL injection attacks sent to a MongoDB server based on log data and other extracted features, excluding raw query statements. The log data was collected from a simulated attack on an…
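To make the attack class described in the abstract concrete, here is a minimal sketch of an operator injection against a login-style filter. It is not taken from the paper: the `build_login_filter` helper, the toy `matches` function, and the `USERS` list are all hypothetical, and the matcher imitates only plain equality and the `$ne` operator rather than a real MongoDB server.

```python
import json


def build_login_filter(raw_body: str) -> dict:
    """Naively build a query filter from a JSON request body (vulnerable)."""
    body = json.loads(raw_body)
    # No type check: a nested object in "password" is passed straight
    # through and will be interpreted as a query operator.
    return {"username": body["username"], "password": body["password"]}


def matches(doc: dict, query: dict) -> bool:
    """Toy matcher (assumption): supports equality and $ne only."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            for op, operand in cond.items():
                if op == "$ne":
                    if value == operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
        elif value != cond:
            return False
    return True


# Hypothetical in-memory "collection" used only for illustration.
USERS = [{"username": "alice", "password": "s3cret"}]

# A benign request with a wrong password, and an injected request that
# replaces the password string with a {"$ne": null} operator object.
benign = build_login_filter('{"username": "alice", "password": "wrong"}')
injected = build_login_filter(
    '{"username": "alice", "password": {"$ne": null}}'
)

benign_hit = any(matches(u, benign) for u in USERS)
injected_hit = any(matches(u, injected) for u in USERS)
print(benign_hit, injected_hit)  # prints: False True
```

The injected filter matches without knowing the password because `{"$ne": null}` is true for any non-null stored value. This is the kind of unauthorized operation the abstract refers to.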

Taxonomy

Topics: Cloud Computing and Resource Management · Network Security and Intrusion Detection · Software System Performance and Reliability