# Development and comparative analysis of machine learning algorithms for predictive atmospheric corrosion modeling

**Authors:** Jose Manuel Perales Fernández, María López Abelairas, Arturo Sánchez-Ramos, Lila Otero-Gonzalez, Leonardo Bertolucci Coelho, Atwakyire Moses

PMC · DOI: 10.12688/openreseurope.19770.1 · Open Research Europe · 2025-03-27

## TL;DR

This paper uses machine learning to predict atmospheric corrosion rates, showing that ensemble methods like random forests are most effective.

## Contribution

The study introduces a comprehensive dataset and demonstrates that ensemble machine learning models significantly improve corrosion rate predictions.

## Key findings

- Ensemble methods, particularly random forests, outperformed other models in predicting corrosion rates.
- Feature engineering and hyperparameter tuning enhanced model accuracy and efficiency.
- A standardized dataset from diverse environments was crucial for effective corrosion modeling.

## Abstract

Industrial materials and infrastructure are in constant danger from atmospheric corrosion, which affects economies globally. However, there is no consistent, comprehensive dataset that covers the full scope of the problem across diverse climates and locations. The purpose of this research is to evaluate the factors that contribute to atmospheric corrosion and its varied effects on materials in different environments.

A comprehensive dataset was created by collecting and standardizing corrosion data from diverse environments and geographic regions. Initial analysis of the data indicated the main parameters affecting corrosion, which guided the selection of features for subsequent modeling. Several machine learning algorithms were tested for their ability to predict corrosion rates, including linear regression, decision trees, neural networks, and, most notably, ensemble methods. The models were assessed on prediction accuracy and computational efficiency, with special attention to refining their performance through detailed feature engineering and hyperparameter tuning.
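The comparison described above can be sketched in a few lines. The following is a minimal illustration, not the authors' pipeline: it uses only the Python standard library, a synthetic one-feature dataset, and a bagged ensemble of small regression trees as a stand-in for a random forest (with a single feature there is no feature subsampling, so this is bagging only). The function names, the data-generating formula, and all parameter values are assumptions introduced for illustration.

```python
import random
import statistics
import time


def make_synthetic_data(n, seed=0):
    """Hypothetical data: one exposure feature x and a corrosion-rate-like target y."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    # Nonlinear response plus Gaussian noise; illustrative only, not from the paper.
    ys = [2.0 * x ** 0.5 + (3.0 if x > 6.0 else 0.0) + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


def fit_linear(xs, ys):
    """Ordinary least squares on a single feature; returns a predict function."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)


def fit_tree(pairs, depth):
    """Regression tree on (x, y) pairs sorted by x; a leaf is a float, a node a tuple."""
    n = len(pairs)
    total = sum(y for _, y in pairs)
    if depth == 0 or n < 4:
        return total / n
    best_score, best_i, left_sum = -1.0, None, 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        # Maximizing this score is equivalent to minimizing squared error.
        score = left_sum ** 2 / i + (total - left_sum) ** 2 / (n - i)
        if score > best_score:
            best_score, best_i = score, i
    if best_i is None:
        return total / n
    threshold = (pairs[best_i - 1][0] + pairs[best_i][0]) / 2
    return (threshold, fit_tree(pairs[:best_i], depth - 1), fit_tree(pairs[best_i:], depth - 1))


def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(node, tuple):
        threshold, left, right = node
        node = left if x <= threshold else right
    return node


def fit_bagged_trees(xs, ys, n_trees=25, depth=4, seed=0):
    """Average of trees, each fit on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    data = list(zip(xs, ys))
    trees = [fit_tree(sorted(rng.choices(data, k=len(data))), depth) for _ in range(n_trees)]
    return lambda x: statistics.fmean(predict_tree(t, x) for t in trees)


def rmse(predict, xs, ys):
    """Root mean squared error of a predict function on (xs, ys)."""
    return (sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


def compare(models, train, test):
    """Fit each model, then record hold-out RMSE and wall-clock fit time."""
    results = {}
    for name, fit in models.items():
        start = time.perf_counter()
        predict = fit(*train)
        elapsed = time.perf_counter() - start
        results[name] = {"rmse": rmse(predict, *test), "fit_seconds": elapsed}
    return results


xs, ys = make_synthetic_data(400)
train, test = (xs[:300], ys[:300]), (xs[300:], ys[300:])
results = compare({"linear": fit_linear, "bagged_trees": fit_bagged_trees}, train, test)
for name, r in results.items():
    print(f"{name}: RMSE={r['rmse']:.3f}, fit time={r['fit_seconds']:.4f}s")
```

Each entry in `results` pairs a hold-out RMSE with a wall-clock fit time, mirroring the two assessment criteria named above. Which model scores better here depends entirely on how the synthetic data was generated and says nothing about the paper's results.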

When evaluated against conventional predictive models, the machine learning approaches, especially ensemble methods such as random forests, predicted corrosion rates well and significantly improved on the conventional models' capabilities. Comparing the machine learning approaches also made clear that their accuracy depends on selecting the best features and tuning each model.

This work represents a significant advancement in the predictive modeling of atmospheric corrosion. It highlights the invaluable role of machine learning in this field. By integrating varied data sets and applying sophisticated machine learning techniques, it has established a foundation for ongoing research and the practical application of corrosion management strategies. The exceptional performance of ensemble methods, like random forests, signals their potential to improve prediction capabilities and guide more effective corrosion prevention measures.

This paper addresses the issue of atmospheric corrosion, a naturally occurring phenomenon that gradually deteriorates materials exposed to air, jeopardising industrial facilities, bridges, and buildings. One of the most significant obstacles to characterising and forecasting this type of corrosion is a lack of structured data. To help with this, we collected and standardised corrosion data from several locations and environmental scenarios to create a more comprehensive and meaningful dataset.

This dataset enabled us to study which factors contribute most to corrosion and informed our selection of relevant data for constructing prediction models. We then experimented with a variety of machine learning algorithms (computer-based methods for detecting patterns in data), ranging from simple linear regression to more complicated decision trees and neural networks. After comparing their accuracy and efficiency, we found that some approaches, particularly "ensemble methods" that combine multiple models, such as random forests, performed best in predicting how quickly materials would corrode.

We also found that carefully selecting the data inputs and adjusting the settings of the models yields more accurate predictions. This study not only reveals which machine learning algorithms are best suited for corrosion prediction, but also lays the groundwork for better monitoring and protection of materials exposed to the atmosphere. By combining data from numerous sources and employing a variety of analytical techniques, this work supports the overall objective of minimising material degradation and ensuring the longevity of critical infrastructure.
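Adjusting model settings is usually done by scoring each candidate setting on held-out data. The sketch below shows one common way to do that: a grid search with k-fold cross-validation over the maximum depth of a small regression tree. It is an illustration under assumptions, not the procedure used in the paper. The data are synthetic, and `fit_tree`, `predict_tree`, `cross_val_rmse`, `grid_search`, and the depth grid are hypothetical names and values.

```python
import random
import statistics


def fit_tree(pairs, max_depth):
    """Fit a one-feature regression tree to (x, y) pairs sorted by x."""
    n = len(pairs)
    total = sum(y for _, y in pairs)
    if max_depth == 0 or n < 2:
        return total / n  # leaf: mean target value
    best_score, best_i, left_sum = -1.0, None, 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        # Maximizing this score is equivalent to minimizing squared error.
        score = left_sum ** 2 / i + (total - left_sum) ** 2 / (n - i)
        if score > best_score:
            best_score, best_i = score, i
    if best_i is None:
        return total / n
    threshold = (pairs[best_i - 1][0] + pairs[best_i][0]) / 2
    return (threshold,
            fit_tree(pairs[:best_i], max_depth - 1),
            fit_tree(pairs[best_i:], max_depth - 1))


def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(node, tuple):
        threshold, left, right = node
        node = left if x <= threshold else right
    return node


def cross_val_rmse(pairs, max_depth, k=5):
    """Mean hold-out RMSE over k folds for one value of max_depth."""
    scores = []
    for fold in range(k):
        held_out = pairs[fold::k]  # every k-th sample, starting at index `fold`
        training = sorted(p for i, p in enumerate(pairs) if i % k != fold)
        tree = fit_tree(training, max_depth)
        sq_err = [(predict_tree(tree, x) - y) ** 2 for x, y in held_out]
        scores.append(statistics.fmean(sq_err) ** 0.5)
    return statistics.fmean(scores)


def grid_search(pairs, depth_grid, k=5):
    """Score every depth in the grid and return the best one with all scores."""
    scores = {depth: cross_val_rmse(pairs, depth, k) for depth in depth_grid}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical data: a smooth nonlinear response with Gaussian noise.
rng = random.Random(0)
pairs = []
for _ in range(300):
    x = rng.uniform(0.0, 10.0)
    pairs.append((x, 2.0 * x ** 0.5 + rng.gauss(0.0, 0.5)))

best_depth, cv_scores = grid_search(pairs, depth_grid=[1, 2, 4, 8, 12])
for depth, score in cv_scores.items():
    print(f"max_depth={depth}: CV RMSE={score:.3f}")
print("selected max_depth:", best_depth)
```

Scoring on folds the tree never saw during fitting is what makes the comparison fair: a very deep tree can fit its training data almost perfectly yet score worse on held-out samples, and cross-validation is meant to expose that.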

## Full-text entities

- **Diseases:** CR (MESH:C536766)
- **Chemicals:** SO₂ (MESH:D013458), oxygen (MESH:D010100), chloride (MESH:D002712), T (MESH:D014316), Magnesium (MESH:D008274), CR (-), steel (MESH:D013232), Metal (MESH:D008670)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12869018/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC12869018/full.md

---
Source: https://tomesphere.com/paper/PMC12869018