Lucas Pereira, Donovan Costa, Miguel Ribeiro
Scientific Data 9, 134 (2022); doi: 10.1038/s41597-022-01252-2
Publication year: 2022

Abstract

Smart meter data is a cornerstone of next-generation electrical power grids because it enables novel energy data-based services, such as energy-saving recommendations and predictive maintenance of electric appliances. Most of these services are built on advanced machine-learning algorithms, which rely heavily on datasets for training, testing, and validation. A limitation of most existing datasets, however, is the scarcity of labels. The SustDataED2 dataset described in this paper contains 96 days of aggregated and individual appliance consumption from one household in Portugal. The aggregate current and voltage waveforms were sampled at 12.8 kHz, and the individual consumption of 18 appliances was sampled at 0.5 Hz. The dataset also contains the timestamps of the ON-OFF transitions of the monitored appliances for the entire deployment, providing the ground truth needed to evaluate machine-learning problems, particularly Non-Intrusive Load Monitoring. The data is accessible in easy-to-use audio and comma-separated formats.
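
As a rough illustration of what these sampling rates imply, the sketch below computes the number of waveform samples per mains cycle and the sample counts over the full deployment. It assumes a 50 Hz mains frequency (the European nominal value, not stated in the abstract) and gap-free recording over all 96 days, so the totals are upper bounds rather than actual file sizes. The constant and function names are hypothetical and are not part of the dataset or its tooling.

```python
# Back-of-the-envelope sample counts for the rates quoted in the abstract.
# Assumptions (not from the source): 50 Hz mains, no gaps in the recordings.

AGGREGATE_RATE_HZ = 12_800   # current and voltage waveforms
APPLIANCE_RATE_HZ = 0.5      # individual appliance consumption
DEPLOYMENT_DAYS = 96
MAINS_FREQUENCY_HZ = 50      # assumed nominal European mains frequency
SECONDS_PER_DAY = 86_400


def samples_per_cycle(rate_hz: float, mains_hz: float) -> float:
    """Number of waveform samples captured in one mains cycle."""
    return rate_hz / mains_hz


def samples_over(days: int, rate_hz: float) -> int:
    """Upper bound on samples per channel, assuming continuous recording."""
    return round(days * SECONDS_PER_DAY * rate_hz)


cycle_samples = samples_per_cycle(AGGREGATE_RATE_HZ, MAINS_FREQUENCY_HZ)
appliance_samples = samples_over(DEPLOYMENT_DAYS, APPLIANCE_RATE_HZ)
waveform_samples = samples_over(DEPLOYMENT_DAYS, AGGREGATE_RATE_HZ)

print(f"Samples per mains cycle:       {cycle_samples:.0f}")
print(f"Appliance samples (per plug):  {appliance_samples:,}")
print(f"Waveform samples (per channel): {waveform_samples:,}")
```

Under these assumptions, each mains cycle is described by 256 waveform samples, while each appliance channel yields one reading every 2 seconds, or at most about 4.1 million readings over the deployment.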