# Impact of sample size on optimisation algorithms for the MLP used in the prediction of client subscription to a term deposit

**Authors:** Tshegofatso Botlhoko, Tlhalitshi Volition Montshiwa, Mohsen Mohammadagha, Tlhalitshi Montshiwa

PMC · DOI: 10.12688/f1000research.168092.1 · F1000Research · 2025-12-22

## TL;DR

This study compares three optimization algorithms (GA, GOA, and CMA-ES) for a multilayer perceptron (MLP) that predicts bank term deposit subscriptions, and finds that CMA-ES-MLP performs best overall across the sample sizes tested.

## Contribution

The main contribution is identifying CMA-ES as the best-performing of the three optimization algorithms compared (GA, GOA, and CMA-ES) for an MLP on this prediction task.

## Key findings

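- CMA-ES-MLP maintained high accuracy, specificity, sensitivity, precision, and F-score, and was the second fastest to train.
- Classification performance showed no consistent increase or decrease as sample size grew; the metrics fluctuated across sample sizes.
- CMA-ES-MLP with a training sample of 5 114 was the preferred classifier and competes well with classifiers built in previous studies on the same dataset.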

## Abstract

One of the disadvantages of the multilayer perceptron (MLP), a machine learning (ML) algorithm used in various fields, is the uncontrolled growth of the total number of parameters, which may make the MLP redundant in high dimensions, together with an ever-growing stack of layers that ignores spatial information. Optimization algorithms were developed to determine the optimum number of parameters for MLP.
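To make the idea of pairing an MLP with a population-based optimizer concrete, the following is a minimal sketch, not the authors' implementation. It is deliberately *not* CMA-ES: it is a plain elitist evolution strategy with a fixed step size and no covariance adaptation, chosen only because it fits in a few lines. The network size, the XOR toy data, and every name (`mlp_forward`, `evolve`, `sigma`, and so on) are assumptions made for illustration.

```python
import math
import random

# Hypothetical toy network: 2 inputs, 3 hidden units, 1 output.
N_IN, N_HIDDEN = 2, 3
N_WEIGHTS = N_HIDDEN * (N_IN + 1) + N_HIDDEN + 1  # hidden layer + output layer


def mlp_forward(weights, x):
    """Forward pass of a tiny MLP whose parameters live in one flat list."""
    idx = 0
    hidden = []
    for _ in range(N_HIDDEN):
        s = weights[idx + N_IN]  # bias of this hidden unit
        for j in range(N_IN):
            s += weights[idx + j] * x[j]
        hidden.append(math.tanh(s))
        idx += N_IN + 1
    out = weights[idx + N_HIDDEN]  # output bias
    for h in range(N_HIDDEN):
        out += weights[idx + h] * hidden[h]
    out = max(-60.0, min(60.0, out))  # clip so exp() cannot overflow
    return 1.0 / (1.0 + math.exp(-out))


def mse(weights, data):
    """Mean squared error of the MLP over (features, label) pairs."""
    return sum((mlp_forward(weights, x) - y) ** 2 for x, y in data) / len(data)


def evolve(data, generations=100, n_offspring=20, n_parents=5, sigma=0.3, seed=0):
    """Elitist evolution strategy with a fixed step size (simplified, not CMA-ES)."""
    rng = random.Random(seed)
    parents = [[rng.gauss(0.0, 1.0) for _ in range(N_WEIGHTS)]
               for _ in range(n_parents)]
    parents.sort(key=lambda w: mse(w, data))
    history = [mse(parents[0], data)]
    for _ in range(generations):
        # Recombine: average the current parents into one search centre.
        mean = [sum(p[i] for p in parents) / n_parents for i in range(N_WEIGHTS)]
        # Mutate: sample offspring around the centre with isotropic Gaussian noise.
        offspring = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                     for _ in range(n_offspring)]
        # Select: keep the best of parents and offspring, so the best loss never worsens.
        pool = sorted(parents + offspring, key=lambda w: mse(w, data))
        parents = pool[:n_parents]
        history.append(mse(parents[0], data))
    return parents[0], history


# Toy data (XOR), used here only to exercise the code.
XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
best, history = evolve(XOR)
print(f"loss: {history[0]:.4f} -> {history[-1]:.4f}")
```

CMA-ES differs from this sketch mainly in that it adapts the step size and a full covariance matrix of the sampling distribution from generation to generation, rather than keeping `sigma` fixed.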

In this paper, the performances of the Genetic Algorithm (GA), Grasshopper Optimization Algorithm (GOA), and Covariance Matrix Adaptation Evolution Strategy (CMA-ES) as optimizers for the MLP are compared. The study also sought to determine the impact of sample size variations on these optimization algorithms. A dataset on the direct marketing campaigns of a Portuguese banking institution from the UCI Machine Learning Repository with a sample size of 4 521 was used. The Synthetic Minority Oversampling Technique (SMOTE) was applied to balance the binary dependent variable in the training data across various sample sizes.
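The core SMOTE step can be sketched as follows: each synthetic minority sample is an interpolation between a minority point and one of its nearest minority neighbours. This is a simplified illustration under stated assumptions, not the study's code. It handles numeric features only (the bank marketing data also contains categorical attributes, which need separate treatment), and the function name `smote`, the parameter `k`, and the toy points are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Simplified sketch: numeric features only, brute-force neighbour search.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy training split: 10 majority rows and 4 minority rows (values are made up).
n_majority = 10
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
synthetic = smote(minority, n_new=n_majority - len(minority))
balanced_minority = minority + synthetic
print(len(balanced_minority))
```

Oversampling only the training split, as the abstract describes, keeps synthetic points out of the evaluation data.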

Based on the classification accuracy, specificity, sensitivity, precision, F-score, and execution time, the MLP based on CMA-ES (CMA-ES-MLP) was identified as the best classifier overall, as it maintained high values of these classification metrics and was the second fastest to train. CMA-ES-MLP with a training sample of 5 114 was our ideal classifier, and it competes well with classifiers built in previous studies on the same dataset.
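The classification metrics listed above all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; it assumes that label 1 marks the positive class (a client who subscribed) and that "F-score" means the F1 score, neither of which is confirmed by this summary. The function name and the toy labels are illustrative.

```python
def classification_metrics(y_true, y_pred):
    """Compute standard binary metrics; label 1 is assumed to be the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of failing when a class never occurs.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Made-up labels: 3 subscribers and 5 non-subscribers.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters on imbalanced data such as this, where a classifier that always predicts the majority class can still score a high accuracy.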

The study found no consistent increase or decrease in the classification performance of the algorithms as the sample size increased, and the metrics fluctuated rapidly across sample sizes. It is recommended that future studies be conducted to compare the best-performing classifiers identified in previous studies with the CMA-ES-MLP in this study under the same experimental conditions.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12910200/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12910200/full.md

## References

94 references — full list in the complete paper: https://tomesphere.com/paper/PMC12910200/full.md

---
Source: https://tomesphere.com/paper/PMC12910200