Dealing with zero-inflated data: achieving SOTA with a two-fold machine learning approach
Jože M. Rožanec, Gašper Petelin, João Costa, Blaž Bertalanič, Gregor Cerar, Marko Guček, Gregor Papa, Dunja Mladenić

TL;DR
This paper introduces a two-fold (hierarchical) machine learning approach for zero-inflated data that improves prediction quality and energy efficiency in two real-world use cases: home appliance classification and airport shuttle demand forecasting.
Contribution
The paper presents a novel hierarchical two-fold modeling technique specifically designed for zero-inflated data, achieving state-of-the-art results and enhanced energy efficiency.
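To make the two-fold idea concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a single numeric feature, a midpoint threshold rule as the first-stage zero/non-zero classifier, and an ordinary least squares line as the second-stage model, fitted only on the non-zero targets. The class name `TwoFoldModel`, the toy data, and both stand-in models are hypothetical.

```python
from statistics import mean


class TwoFoldModel:
    """Hypothetical two-stage model for zero-inflated targets (illustration only)."""

    def fit(self, xs, ys):
        zero_x = [x for x, y in zip(xs, ys) if y == 0]
        nonzero = [(x, y) for x, y in zip(xs, ys) if y != 0]
        nonzero_x = [x for x, _ in nonzero]

        # Stage 1 (assumed stand-in classifier): midpoint threshold between
        # the mean feature value of the zero and the non-zero examples.
        self.threshold = (mean(zero_x) + mean(nonzero_x)) / 2
        self.nonzero_above = mean(nonzero_x) > mean(zero_x)

        # Stage 2 (assumed stand-in regressor): least squares line fitted on
        # the non-zero examples only, so the zeroes cannot bias the fit.
        mx = mean(nonzero_x)
        my = mean(y for _, y in nonzero)
        sxx = sum((x - mx) ** 2 for x in nonzero_x)
        sxy = sum((x - mx) * (y - my) for x, y in nonzero)
        self.slope = sxy / sxx if sxx else 0.0
        self.intercept = my - self.slope * mx
        return self

    def predict(self, x):
        # First decide zero vs. non-zero, then estimate the value if needed.
        is_nonzero = (x > self.threshold) == self.nonzero_above
        if not is_nonzero:
            return 0.0
        return self.slope * x + self.intercept


# Made-up toy data: the target is zero for most inputs.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 0, 0, 0, 0, 12, 14, 16]

model = TwoFoldModel().fit(xs, ys)
print(model.predict(3), model.predict(9))  # prints "0.0 14.0"
```

In practice each stage would be a proper learned model. The point of the sketch is only the structure: the second stage never sees the zero-valued examples.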
Findings
Significant improvements in Precision, Recall, F1, and AUC ROC metrics.
Four times more energy efficient than previous state-of-the-art methods.
Statistically significant performance gains across all tested cases.
Abstract
In many cases, a machine learning model must learn to correctly predict a few data points with particular values of interest in a broader range of data where many target values are zero. Zero-inflated data can be found in diverse scenarios, such as lumpy and intermittent demands, power consumption for home appliances being turned on and off, impurities measurement in distillation processes, and even airport shuttle demand prediction. The presence of zeroes affects the models' learning and may result in poor performance. Furthermore, zeroes also distort the metrics used to compute the model's prediction quality. This paper showcases two real-world use cases (home appliances classification and airport shuttle demand prediction) where a hierarchical model applied in the context of zero-inflated data leads to excellent results. In particular, for home appliances classification, the weighted…
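The claim that zeroes distort quality metrics can be illustrated with a small Python sketch. The data are made up for illustration (95 zero targets and 5 non-zero ones, encoded as labels 0 and 1), and the always-zero predictor is a hypothetical baseline, not a model from the paper.

```python
# Made-up labels: 1 marks a non-zero target, 0 marks a zero target.
y_true = [0] * 95 + [1] * 5
# Hypothetical baseline that always predicts "zero".
y_pred = [0] * 100

# Accuracy counts every correctly predicted zero, so it looks excellent.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall on the non-zero class shows that no case of interest was found.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(accuracy, recall)  # prints "0.95 0.0"
```

A model that never detects a value of interest still reaches 95% accuracy here, which is why metrics computed per class on the non-zero cases are more informative for zero-inflated data.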
Taxonomy
Topics: Energy Load and Power Forecasting · Energy, Environment, and Transportation Policies · Air Quality Monitoring and Forecasting
