Comparing Clustering Approaches for Smart Meter Time Series: Investigating the Influence of Dataset Properties on Performance
Luke W. Yerbury, Ricardo J. G. B. Campello, G. C. Livingston Jr, Mark Goldsworthy, Lachlan O'Neil

TL;DR
This study systematically compares clustering methods for smart meter time series data, examining how dataset properties affect performance and identifying robust distance measures such as Dynamic Time Warping (DTW) and $k$-sliding distance.
Contribution
It introduces a comprehensive framework for evaluating clustering methods on synthetic datasets tailored to smart meter data, considering dataset variability and robustness.
Findings
DTW and $k$-sliding outperform traditional methods
Robust clustering combinations identified with $k$-medoids and hierarchical clustering
Dataset properties significantly influence clustering performance
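To illustrate why an elastic measure such as DTW can suit peak-driven load profiles, here is a minimal sketch of the textbook dynamic-programming DTW next to plain Euclidean distance. It is not the paper's implementation: the function names, the optional `window` (Sakoe-Chiba band) parameter, and the toy profiles are assumptions made for illustration only.

```python
import math


def dtw_distance(a, b, window=None):
    """Textbook DTW between two sequences, with an optional Sakoe-Chiba band.

    `window` (an illustrative parameter) limits how far the alignment may
    stray from the diagonal; None means unconstrained warping.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must be at least as wide as the length difference.
    window = max(window, abs(n - m))

    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def euclidean(a, b):
    """Point-by-point Euclidean distance for equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy profiles: the same peak, shifted by one time step.
early_peak = [0, 0, 1, 5, 1, 0, 0, 0]
late_peak = [0, 0, 0, 1, 5, 1, 0, 0]

print("DTW:      ", dtw_distance(early_peak, late_peak, window=1))
print("Euclidean:", euclidean(early_peak, late_peak))
```

In this toy case DTW aligns the two peaks, while the Euclidean distance penalises the one-step shift. Whether that tolerance to timing shifts is desirable depends on how clusters are defined for the dataset at hand.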
Abstract
The widespread adoption of smart meters for monitoring energy consumption has generated vast quantities of high-resolution time series data which remains underutilised. While clustering has emerged as a fundamental tool for mining smart meter time series (SMTS) data, selecting appropriate clustering methods remains challenging despite numerous comparative studies. These studies often rely on problematic methodologies and consider a limited scope of methods, frequently overlooking compelling methods from the broader time series clustering literature. Consequently, they struggle to provide dependable guidance for practitioners designing their own clustering approaches. This paper presents a comprehensive comparative framework for SMTS clustering methods using expert-informed synthetic datasets that emphasise peak consumption behaviours as fundamental cluster concepts. Using a phased…
Taxonomy
Topics: Time Series Analysis and Forecasting · Energy Load and Power Forecasting · Data Stream Mining Techniques
