# Optimization of indirect wastewater characterization using LED spectrophotometry: a comparative analysis of regression, scaling, and dimensionality reduction methods

**Authors:** Daniel Carreres-Prieto, Enrique Fernandez-Blanco, Daniel Rivero, Juan R. Rabuñal, Jose Anta, Juan T. García

PMC · DOI: 10.1007/s11356-024-34714-8 · Environmental Science and Pollution Research International · 2024-08-28

## TL;DR

This paper compares 84 data processing pipelines for estimating wastewater pollutant load (COD and TSS) from LED spectrophotometry, and finds that normalization followed by PCA and a Random Forest Regressor performs best.

## Contribution

The study systematically compares 84 data processing pipelines, each combining a regressor, a scaling method, and a dimensionality reduction technique, for wastewater characterization using LED spectrophotometry.

## Key findings

- Normalization followed by PCA and Random Forest Regressor provided the best performance for predicting COD and TSS.
- Eighty-four pipelines were tested, formed by combining 7 regression techniques, 3 scaling methods, and 4 dimensionality reduction options.
- Tenfold cross-validation on 15 sub-datasets, which account for variations in plants and wastewater types, showed that the choice of regressor, scaling, and dimensionality reduction plays a crucial role in pipeline performance.
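
The pipeline count follows from a full cross of the three choices (7 × 3 × 4 = 84). A minimal sketch of that enumeration is shown below. Only "normalization", "PCA", and "Random Forest" are named in this summary; every other label is a hypothetical placeholder, and the fit count assumes each pipeline is fitted once per fold on every sub-dataset for a given target.

```python
from itertools import product

# Labels named in the summary: "normalization", "pca", "random_forest".
# All other entries are hypothetical placeholders, not the paper's methods.
SCALERS = ["normalization", "scaler_2", "scaler_3"]                      # 3
REDUCTIONS = ["pca", "reduction_2", "reduction_3", "reduction_4"]        # 4
REGRESSORS = ["random_forest"] + [f"regressor_{i}" for i in range(2, 8)]  # 7

N_SUBDATASETS = 15  # sub-datasets derived from the original dataset
N_FOLDS = 10        # tenfold cross-validation


def enumerate_pipelines():
    """Return every (scaler, reduction, regressor) combination."""
    return list(product(SCALERS, REDUCTIONS, REGRESSORS))


pipelines = enumerate_pipelines()

# Assumption: one model fit per pipeline, per sub-dataset, per fold,
# counted for a single target (COD or TSS).
fits_per_target = len(pipelines) * N_SUBDATASETS * N_FOLDS

print(len(pipelines))
print(fits_per_target)
```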

## Abstract

LED spectrophotometry is a robust technique for the indirect characterization of wastewater pollutant load through correlation modeling. To build and evaluate such models, a dataset of 1300 samples was collected over 4 years from both raw and treated wastewater at 45 wastewater treatment plants in Spain and Chile. The type of regressor, the scaling method, the dimensionality reduction technique, and the nature of the data all play crucial roles in the performance of the processing pipeline. Eighty-four pipelines were tested through exhaustive experimentation, resulting from the combination of 7 regression techniques, 3 scaling methods, and 4 possible dimensionality reductions. These combinations were tested on the prediction of chemical oxygen demand (COD) and total suspended solids (TSS). Each pipeline underwent tenfold cross-validation on 15 sub-datasets derived from the original dataset, accounting for variations in plants and wastewater types. The combination that stood out was normalization of the data, followed by PCA, followed by a Random Forest Regressor. These results highlight the importance of modeling strategies in wastewater management using techniques such as LED spectrophotometry.

The online version contains supplementary material available at 10.1007/s11356-024-34714-8.
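
As a rough illustration of the best-performing combination described in the abstract, the sketch below chains scaling, PCA, and a Random Forest Regressor with scikit-learn and scores it with tenfold cross-validation. It is not the authors' implementation: the data are synthetic, the mapping of "normalization" to `MinMaxScaler` is an assumption, and the number of wavelengths, PCA components, and trees are arbitrary illustrative values.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 200 samples with absorbance
# readings at 30 hypothetical LED wavelengths, and a made-up target
# playing the role of COD or TSS.
X = rng.uniform(0.0, 2.0, size=(200, 30))
y = 100.0 * X[:, :5].sum(axis=1) + rng.normal(0.0, 10.0, size=200)

# Scaling -> dimensionality reduction -> regression.
# MinMaxScaler is an assumed reading of "normalization".
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("reduce", PCA(n_components=5)),
    ("regress", RandomForestRegressor(n_estimators=50, random_state=0)),
])

# Tenfold cross-validation; the pipeline is refit inside each fold,
# so scaling and PCA never see the held-out samples.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="r2")

print(f"mean R^2 over {len(scores)} folds: {scores.mean():.3f}")
```

Wrapping the three steps in a single `Pipeline` keeps the scaler and PCA fitted only on each training fold, which avoids leaking information from the validation fold into preprocessing.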

## Full-text entities

- **Chemicals:** TSS (-)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11413097/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11413097/full.md

## References

2 references — full list in the complete paper: https://tomesphere.com/paper/PMC11413097/full.md

---
Source: https://tomesphere.com/paper/PMC11413097