A Data-driven feature selection and machine-learning model benchmark for the prediction of longitudinal dispersion coefficient
Yifeng Zhao, Pei Zhang, S.A. Galindo-Torres, Stan Z. Li

TL;DR
This study benchmarks machine learning models for predicting the longitudinal dispersion coefficient in streams, with emphasis on feature selection, model comparison, and the identification of key parameters such as channel slope.
Contribution
It introduces a data-driven feature selection method and provides a comprehensive benchmark of ML models for longitudinal dispersion (LD) prediction, highlighting the importance of proper feature choice and model selection.
Findings
Support vector machine outperforms other models.
Simple linear models are more effective than complex ones in this context.
Channel slope is a key parameter for accurate LD prediction.
Abstract
Longitudinal dispersion (LD) is the dominant process of scalar transport in natural streams. An accurate prediction of the LD coefficient (Dl) can substantially improve the performance of related simulations. Emerging machine learning (ML) techniques provide a self-adaptive tool for this problem. However, most existing studies rely on an unproven four-parameter feature set obtained through simple theoretical deduction, and few have examined its reliability and rationale. Moreover, because comparative studies are lacking, the proper choice of ML model in different scenarios remains unknown. In this study, the Feature Gradient selector was first adopted to distill locally optimal feature sets directly from multivariable data. Then, a globally optimal feature set (the channel width, the flow velocity, the channel slope and the cross-sectional area) was proposed through numerical…
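The idea of selecting features from data rather than from theory can be illustrated with a small sketch. This is not the Feature Gradient selector used in the paper; it swaps in plain greedy forward selection scored by cross-validated least squares, and it runs on synthetic data. The feature names, the exponents in `make_synthetic`, and all function names are assumptions made for illustration only and are not taken from the study.

```python
import random

# Hypothetical log-transformed candidate features (illustrative names only).
FEATURES = ["log_width", "log_depth", "log_velocity",
            "log_slope", "log_area", "log_unrelated"]

def make_synthetic(n=200, seed=0):
    """Synthetic data: the target depends on width, velocity and slope only.
    The coefficients 1.2, 0.8 and -0.5 are invented, not measured values."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.uniform(-1.0, 1.0) for _ in FEATURES]
        target = 1.2 * row[0] + 0.8 * row[2] - 0.5 * row[3] + rng.gauss(0.0, 0.1)
        X.append(row)
        y.append(target)
    return X, y

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Xb = [[1.0] + row for row in X]
    p = len(Xb[0])
    XtX = [[sum(r[i] * r[j] for r in Xb) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(Xb, y)) for i in range(p)]
    return solve_linear(XtX, Xty)

def predict(coef, X):
    """Apply intercept-first coefficients to each row of X."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], row)) for row in X]

def cv_mse(X, y, cols, k=5):
    """Mean squared error on the chosen columns, averaged over k folds."""
    n = len(y)
    errs = []
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        Xtr = [[X[i][c] for c in cols] for i in train]
        Xte = [[X[i][c] for c in cols] for i in test]
        coef = fit_ols(Xtr, [y[i] for i in train])
        pred = predict(coef, Xte)
        errs.append(sum((p - y[i]) ** 2 for p, i in zip(pred, test)) / len(test))
    return sum(errs) / k

def forward_select(X, y, names, max_features=4):
    """Greedily add the feature that most lowers the cross-validated error;
    stop when no candidate improves it or max_features is reached."""
    chosen, best = [], float("inf")
    while len(chosen) < max_features:
        scores = {c: cv_mse(X, y, chosen + [c])
                  for c in range(len(names)) if c not in chosen}
        c, score = min(scores.items(), key=lambda kv: kv[1])
        if score >= best:
            break
        chosen.append(c)
        best = score
    return [names[c] for c in chosen], best

X, y = make_synthetic()
selected, score = forward_select(X, y, FEATURES)
print(selected, round(score, 4))
```

Working in log space means a linear fit corresponds to a power-law relation between the coefficient and the raw variables, which is one common way such empirical formulas are written. On real measurements the selected set would depend on the dataset and on the scoring model, so the output here says nothing about which parameters matter in practice.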
Taxonomy
Topics: Air Quality Monitoring and Forecasting · Advanced Multi-Objective Optimization Algorithms · Flow Measurement and Analysis
