Large-Scale Detection of Non-Technical Losses in Imbalanced Data Sets

Patrick O. Glauner; Andre Boechat; Lautaro Dolberg; Radu State; Franck Bettinger; Yves Rangoni; Diogo Duarte

arXiv:1602.08350·cs.LG·July 26, 2017

TL;DR

This paper evaluates three machine learning models for detecting non-technical losses in large, imbalanced electricity customer datasets, emphasizing real-world deployment challenges and model effectiveness.

Contribution

It provides a comprehensive assessment of Boolean rules, fuzzy logic, and SVM models for NTL detection in large, imbalanced datasets, addressing deployment and data imbalance issues.

Findings

01. Support Vector Machine outperforms other models in detection accuracy.

02. Models show robustness across varying NTL proportions.

03. Approach is ready for deployment in industry solutions.

Abstract

Non-technical losses (NTL) such as electricity theft cause significant harm to our economies, as in some countries they may range up to 40% of the total electricity distributed. Detecting NTLs requires costly on-site inspections. Accurate prediction of NTLs for customers using machine learning is therefore crucial. To date, related research largely ignores that the two classes of regular and non-regular customers are highly imbalanced and that NTL proportions may change, and it mostly considers small data sets, which often prevents the results from being deployed in production. In this paper, we present a comprehensive approach to assess three NTL detection models for different NTL proportions in large real-world data sets of hundreds of thousands of customers: Boolean rules, fuzzy logic and Support Vector Machine. This work has resulted in appreciable results that are about to be deployed in a leading industry solution.…
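
The class-imbalance concern raised in the abstract can be illustrated with a small sketch. The code below is not taken from the paper: the labels are synthetic, the `always_regular` baseline and all function names are hypothetical, and the NTL proportions (1%, 10%, 50%) are chosen only for illustration. It shows how a predictor that never flags any customer scores differently on accuracy and recall as the NTL proportion changes.

```python
def make_labels(n_customers, ntl_proportion):
    """Return synthetic labels: 1 = NTL customer, 0 = regular customer."""
    n_ntl = round(n_customers * ntl_proportion)
    return [1] * n_ntl + [0] * (n_customers - n_ntl)


def always_regular(labels):
    """Hypothetical baseline that never flags anyone for inspection."""
    return [0] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of customers whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true NTL customers that the predictor flags."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(y_true)
    return true_pos / actual_pos if actual_pos else 0.0


def evaluate_baseline(proportions, n_customers=1000):
    """Score the trivial baseline at each (assumed) NTL proportion."""
    results = {}
    for p in proportions:
        y_true = make_labels(n_customers, p)
        y_pred = always_regular(y_true)
        results[p] = (accuracy(y_true, y_pred), recall(y_true, y_pred))
    return results


if __name__ == "__main__":
    for p, (acc, rec) in evaluate_baseline([0.01, 0.1, 0.5]).items():
        print(f"NTL proportion {p:.2f}: accuracy={acc:.2f}, recall={rec:.2f}")
```

For this baseline, accuracy equals one minus the NTL proportion while recall stays at zero, so a high accuracy on a heavily imbalanced data set says little about whether any NTL is actually detected. This is one reason why comparing models across different NTL proportions calls for more than a single accuracy figure.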
