LAMDA: A Longitudinal Android Malware Benchmark for Concept Drift Analysis

Md Ahsanul Haque; Ismail Hossain; Md Mahmuduzzaman Kamol; Md Jahangir Alam; Suresh Kumar Amalapuram; Sajedul Talukder; Mohammad Saidur Rahman

arXiv:2505.18551 · cs.CR · May 27, 2025

TL;DR

LAMDA is a comprehensive, long-term Android malware dataset designed to analyze concept drift, enabling researchers to evaluate how malware detection models' performance degrades over time due to evolving malware and benign applications.

Contribution

This paper introduces LAMDA, the largest and most temporally diverse Android malware benchmark specifically created for concept drift analysis in malware detection.

Findings

01 Standard ML models' performance degrades over time on LAMDA

02 Feature stability varies across different malware families

03 LAMDA enables detailed study of malware evolution and detection challenges

Abstract

Machine learning (ML)-based malware detection systems often fail to account for the dynamic nature of real-world training and test data distributions. In practice, these distributions evolve due to frequent changes in the Android ecosystem, adversarial development of new malware families, and the continuous emergence of both benign and malicious applications. Prior studies have shown that such concept drift, that is, distributional shifts in benign and malicious samples, leads to significant degradation in detection performance over time. Despite the practical importance of this issue, existing datasets are often outdated and limited in temporal scope, diversity of malware families, and sample scale, making them insufficient for the systematic evaluation of concept drift in malware detection. To address this gap, we present LAMDA, the largest and most temporally diverse Android malware…

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

The paper is well-organized. The research topic is significant. The experiments are sufficient.

Weaknesses

It lacks a clear comparison with relevant datasets. It lacks specific guidance for future work.

Reviewer 02 · Rating 6 · Confidence 2

Strengths

This work is outside my current area of research, hence my lower confidence score. The dataset is large and, to the best of my knowledge, the longest longitudinal malware dataset collected to date. The analysis is very thorough.

Weaknesses

See questions

Reviewer 03 · Rating 8 · Confidence 5

Strengths

Scale and Temporal Scope: Over 1 million samples across 12 years with 1,380 families and 150K singleton samples provide unprecedented temporal coverage and diversity for Android malware research. This addresses a genuine gap in existing datasets.

Comprehensive Drift Analysis: The multi-faceted approach (supervised learning degradation, feature distribution shifts via Jeffreys divergence, feature stability scores, SHAP-based explanation drift, label drift) provides rich evidence for concept…
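The Jeffreys divergence named in this review is the symmetrized Kullback-Leibler divergence, J(P, Q) = KL(P || Q) + KL(Q || P) = Σ (p_i − q_i) ln(p_i / q_i). The sketch below illustrates it for a single binary feature compared across two time periods. It is not the paper's implementation: the per-feature Bernoulli treatment, the `eps` smoothing, the helper names (`jeffreys_divergence`, `bernoulli_rate`, `feature_drift`), and the toy data are all assumptions made for illustration.

```python
import math


def jeffreys_divergence(p, q, eps=1e-12):
    """Symmetric KL divergence: sum_i (p_i - q_i) * ln(p_i / q_i)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Assumption: clamp to eps so empty bins do not cause log(0),
    # then renormalize so each vector still sums to 1.
    p = [max(x, eps) for x in p]
    q = [max(x, eps) for x in q]
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def bernoulli_rate(column):
    """Fraction of samples in which a binary (0/1) feature is present."""
    return sum(column) / len(column)


def feature_drift(col_a, col_b):
    """Jeffreys divergence between one binary feature's distribution in two periods."""
    ra, rb = bernoulli_rate(col_a), bernoulli_rate(col_b)
    return jeffreys_divergence([ra, 1 - ra], [rb, 1 - rb])


# Hypothetical toy data: one binary feature observed in two different years.
year_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # feature present in 20% of samples
year_b = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # feature present in 60% of samples

print(feature_drift(year_a, year_a))  # identical distributions give zero drift
print(feature_drift(year_a, year_b))  # larger values indicate a larger shift
```

Because the measure is symmetric, the result does not depend on which period is treated as the reference, which is convenient when comparing arbitrary pairs of years.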

Weaknesses

Unclear Scan Consistency: The paper does not specify whether VirusTotal labels were obtained from single-pass or repeated scans. Since detection outcomes can vary across rescans, this ambiguity may introduce label inconsistency.

Lack of Intra-Sample Drift Analysis: The study analyzes global and family-level drift but does not consider intra-sample temporal variation, that is, how the same APK's features might change across time. Such analysis could better capture longitudinal behavior shifts.

Static…

Code & Models

Repositories

iqsec-lab/lamda
PyTorch · Official

Datasets

Taxonomy

Topics: Data Stream Mining Techniques · Caching and Content Delivery · Peer-to-Peer Network Technologies