Improve Machine Learning carbon footprint using Parquet dataset format and Mixed Precision training for regression models -- Part II
Andrew Antonopoulos

TL;DR
This study compares power consumption of regression machine learning models using CSV and Parquet data formats with mixed precision training, finding that hyper-parameter tuning and data format choices have limited impact on overall power savings.
Contribution
It provides empirical evidence on how mixed precision and data formats affect power consumption in regression models, highlighting the importance of hyper-parameter optimization.
Findings
Mixed precision training reduces power consumption by 7-11 Watts.
Hyper-parameters such as batch size and number of neurons can negatively affect power consumption, so they need careful configuration.
No statistically significant difference in power consumption was found between data formats.
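To put a power difference in watts into energy and emissions terms, the usual conversion is energy (kWh) = power (W) × time (h) / 1000, and emissions = energy × grid carbon intensity. The sketch below applies that conversion to the 7 W and 11 W figures above. The training duration and the carbon intensity are illustrative assumptions, not values from the study, and the function names are hypothetical.

```python
def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy in kWh for a constant power draw sustained over `hours`."""
    return power_watts * hours / 1000.0


def emissions_kg(energy_kwh_value: float, intensity_kg_per_kwh: float) -> float:
    """Emissions in kg CO2e for a given energy use and grid carbon intensity."""
    return energy_kwh_value * intensity_kg_per_kwh


# Assumed inputs for illustration only (not taken from the study).
TRAINING_HOURS = 10.0      # hypothetical total training time
CARBON_INTENSITY = 0.4     # hypothetical grid intensity, kg CO2e per kWh

for saving_watts in (7.0, 11.0):  # range of savings reported in the findings
    saved_energy = energy_kwh(saving_watts, TRAINING_HOURS)
    saved_co2 = emissions_kg(saved_energy, CARBON_INTENSITY)
    print(f"{saving_watts:.0f} W saved -> {saved_energy:.3f} kWh, {saved_co2:.3f} kg CO2e")
```

Because emissions are derived from the same power data, any reduction in power carries over proportionally to the carbon estimate.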
Abstract
This is the second part of the dissertation for my master's degree. It compares power consumption when using the Comma-Separated Values (CSV) and Parquet dataset formats with the default 32-bit floating point and Nvidia mixed precision (16-bit and 32-bit) while training a regression ML model. The experiments were performed on the same custom-built PC as in the first part, which was dedicated to classification testing and analysis, and different ML hyper-parameters, such as batch size, neurons, and epochs, were chosen to build Deep Neural Networks (DNNs). A benchmarking test with default hyper-parameter values for the DNN was used as a reference, while the experiments used a combination of different settings. The results were recorded in Excel, and descriptive statistics were used to calculate the mean of each group and to compare the groups using graphs and tables. The outcome was positive when…
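The descriptive comparison described in the abstract (a mean per experiment group, compared against a benchmark) can be sketched as follows. This is a minimal illustration, not the author's workflow, which used Excel. The group names, the helper functions, and every power reading are hypothetical placeholders.

```python
from statistics import mean


def group_means(readings: dict[str, list[float]]) -> dict[str, float]:
    """Mean power reading (in watts) for each experiment group."""
    return {name: mean(values) for name, values in readings.items()}


def delta_vs_benchmark(means: dict[str, float], benchmark: str) -> dict[str, float]:
    """Difference between each group's mean and the benchmark group's mean."""
    base = means[benchmark]
    return {name: m - base for name, m in means.items() if name != benchmark}


# Placeholder readings in watts, invented purely to exercise the code.
readings = {
    "benchmark_csv_fp32": [100.0, 102.0, 98.0],
    "csv_mixed": [95.0, 93.0, 94.0],
    "parquet_fp32": [101.0, 99.0, 100.0],
    "parquet_mixed": [92.0, 94.0, 93.0],
}

means = group_means(readings)
for name, delta in delta_vs_benchmark(means, "benchmark_csv_fp32").items():
    print(f"{name}: mean {means[name]:.1f} W, {delta:+.1f} W vs benchmark")
```

A difference in means is only descriptive. Whether it is statistically significant requires an inferential test on the underlying samples.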
Taxonomy
Topics: Air Quality Monitoring and Forecasting
