Big Data Workload Profiling for Energy-Aware Cloud Resource Management
Milan Parikh, Aniket Abhishek Soni, Sneja Mitinbhai Shah, Ayush Raj Jha

TL;DR
This paper introduces an energy-aware scheduling framework for cloud data centers that uses workload profiling to guide virtual machine placement, reporting 15 to 20 percent energy savings over a baseline scheduler with negligible performance degradation.
Contribution
It presents a novel workload profiling approach combining historical logs and real-time telemetry to guide energy-efficient VM placement in cloud environments.
Findings
Achieves 15-20% energy savings over baseline schedulers
Maintains service level agreements with negligible performance degradation
Demonstrates effectiveness across Hadoop, Spark, and ETL workloads
Abstract
Cloud data centers face increasing pressure to reduce operational energy consumption as big data workloads continue to grow in scale and complexity. This paper presents a workload-aware, energy-efficient scheduling framework that profiles CPU utilization, memory demand, and storage I/O behavior to guide virtual machine placement decisions. By combining historical execution logs with real-time telemetry, the proposed system predicts the energy and performance impact of candidate placements and enables adaptive consolidation while preserving service level agreement compliance. The framework is evaluated using representative Hadoop MapReduce, Spark MLlib, and ETL workloads deployed on a multi-node cloud testbed. Experimental results demonstrate consistent energy savings of 15 to 20 percent compared to a baseline scheduler, with negligible performance degradation. These findings highlight…
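The placement idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Profile` and `Host` types, the `blend` weighting, the linear power model, and the 0.8 utilization cap standing in for an SLA guard are all assumptions made for illustration. The sketch blends a historical profile with live telemetry, then greedily places each VM on the host whose estimated power draw rises the least.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Profile:
    """Hypothetical workload profile; each value is a fraction of one host's capacity."""
    cpu: float
    mem: float
    io: float


def blend(historical: Profile, live: Profile, alpha: float = 0.5) -> Profile:
    """Weighted average of a historical profile and live telemetry (alpha is an assumed weight)."""
    def mix(h: float, l: float) -> float:
        return (1 - alpha) * h + alpha * l
    return Profile(mix(historical.cpu, live.cpu),
                   mix(historical.mem, live.mem),
                   mix(historical.io, live.io))


@dataclass
class Host:
    name: str
    idle_watts: float
    peak_watts: float
    vms: List[Profile] = field(default_factory=list)

    def load(self) -> Profile:
        """Sum of the profiles of all VMs currently on this host."""
        return Profile(sum(v.cpu for v in self.vms),
                       sum(v.mem for v in self.vms),
                       sum(v.io for v in self.vms))


def added_power(host: Host, vm: Profile) -> float:
    """Estimated extra watts from placing vm on host, under an assumed linear power model.

    An empty host is treated as powered off, so waking it also costs its idle power.
    """
    dynamic = (host.peak_watts - host.idle_watts) * vm.cpu
    wake = host.idle_watts if not host.vms else 0.0
    return dynamic + wake


def place(vm: Profile, hosts: List[Host], cap: float = 0.8) -> Optional[Host]:
    """Greedy placement: pick the feasible host with the smallest estimated power increase.

    The cap on CPU, memory, and I/O is an assumed stand-in for an SLA guard.
    Returns None when no host can take the VM without exceeding the cap.
    """
    best, best_delta = None, float("inf")
    for host in hosts:
        cur = host.load()
        if max(cur.cpu + vm.cpu, cur.mem + vm.mem, cur.io + vm.io) > cap:
            continue
        delta = added_power(host, vm)
        if delta < best_delta:
            best, best_delta = host, delta
    if best is not None:
        best.vms.append(vm)
    return best


hosts = [Host("a", idle_watts=100.0, peak_watts=200.0),
         Host("b", idle_watts=100.0, peak_watts=200.0)]
vm = blend(Profile(0.2, 0.1, 0.1), Profile(0.4, 0.3, 0.1))
chosen = [place(vm, hosts) for _ in range(3)]
print([h.name if h else None for h in chosen])
```

Because waking an empty host costs its idle power, the sketch fills an already active host up to the cap before touching the next one, which is the consolidation behavior the abstract refers to. A real scheduler would also need migration costs and a learned power and performance model, which this sketch leaves out.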
Taxonomy
Topics: Cloud Computing and Resource Management · Big Data and Digital Economy · Distributed and Parallel Computing Systems
