Data-Driven Efficiency: Predicting Energy Consumption with Machine Learning
In the previous post of this series, we explored the critical role buildings play in our climate efforts and highlighted various scenarios where AI can make a significant contribution. In this article, we delve deeper into the technical aspects, starting with the fundamentals of energy consumption prediction. We then examine a selection of machine learning algorithms widely used in the field, analyzing their advantages and limitations.
Introduction to Energy Consumption Modeling
Computational models employed for energy consumption prediction are generally classified into three distinct categories: white-box, grey-box, and black-box models.
White-box models are grounded in established physical or process equations, making them completely interpretable. They are extensively used when a system's underlying physics is well understood. Examples of white-box models include building energy simulation tools such as EnergyPlus and TRNSYS.
On the opposite end of the spectrum, black-box models are data-driven and do not depend on explicit, predetermined equations. Instead, they learn from the data itself, making them highly adaptable and suited for modelling complex and non-linear systems.
Balancing these two extremes are grey-box models, such as state-space models. These models merge both data-driven and equation-based elements, proving useful in situations where a partial understanding of system dynamics is available.
In this series, our main emphasis will be on black-box models, and particularly Machine Learning models. These models are inherently data-driven and intentionally avoid incorporating physical equations to forecast building consumption. Their strength lies in managing complex, non-linear relationships among variables within the data. Given high-quality, highly granular training data, they are capable of quickly delivering very accurate predictions. This efficiency is even more evident when dealing with complex systems that would traditionally necessitate extensive time and engineering knowledge to model using white-box or grey-box approaches.
There are two main classes of machine learning models: supervised learning and unsupervised learning models.
Supervised learning: In this approach, the model is trained on a dataset where each input is paired with a correct output (label). The model refines its predictions by comparing its outputs against these known labels and adjusting its internal parameters to improve accuracy. A typical example in the energy field is predicting house energy labels: the model is trained on a dataset with features like house size, location, and age, together with the corresponding energy label, and then learns to predict the label for new houses based on these features.
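The energy-label example above can be sketched in a few lines of scikit-learn. Everything here is illustrative: the feature names, the synthetic scoring rule, and the four label classes (standing in for ratings A through D) are invented for the sketch, not drawn from a real dataset.

```python
# Hypothetical sketch: supervised classification of house energy labels
# from size, age, and an encoded location. Data is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500
size_m2 = rng.uniform(40, 250, n)      # floor area
age_years = rng.uniform(0, 100, n)     # building age
region = rng.integers(0, 3, n)         # encoded location

# Synthetic rule: newer, smaller houses tend to get better labels
score = 0.01 * size_m2 + 0.03 * age_years + rng.normal(0, 0.5, n)
labels = np.digitize(score, bins=[1.5, 2.5, 3.5])  # classes 0..3, i.e. "A".."D"

X = np.column_stack([size_m2, age_years, region])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)

# Predict the label class for a new 120 m2, 30-year-old house in region 1
predicted = clf.predict([[120.0, 30.0, 1]])[0]
```

Because each training example carries a known label, the model can measure its own error during training and adjust accordingly, which is the defining trait of supervised learning.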
Unsupervised learning: The model is trained on a dataset that contains no labels or predefined outputs; it autonomously identifies patterns and relationships within the data. In the energy field, for instance, unsupervised learning can analyze electricity usage data to uncover distinct consumption patterns, such as identifying which buildings or users consume more energy during peak hours, without any prior categorization.
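The consumption-pattern example can be sketched with k-means clustering. The two synthetic profile shapes below (an evening residential peak and a flat daytime office load) are invented for illustration; the point is that the algorithm recovers the two groups without ever seeing a label.

```python
# Minimal sketch: clustering daily electricity-load profiles (kWh per hour)
# with k-means to surface distinct consumption patterns. Data is synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
hours = np.arange(24)

# Two synthetic profile shapes for 200 buildings
evening_peak = 1.0 + 2.0 * np.exp(-((hours - 19) ** 2) / 8.0)  # residential-like
flat_daytime = 1.5 + 1.0 * ((hours >= 8) & (hours <= 17))      # office-like

profiles = np.vstack(
    [evening_peak + rng.normal(0, 0.2, 24) for _ in range(100)]
    + [flat_daytime + rng.normal(0, 0.2, 24) for _ in range(100)]
)

# No labels are provided: k-means groups buildings by profile shape alone
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)
cluster_sizes = np.bincount(km.labels_)
```

In practice the number of clusters is not known in advance and is itself a modeling choice, often guided by heuristics such as the elbow method or silhouette scores.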
Energy consumption prediction belongs to the supervised learning subset: a model is constructed to forecast an output Y (energy consumption, also called dependent variable) based on one or more inputs X (conditions affecting energy usage, also called independent variables). The model, once built, generally serves two core purposes:
Prediction: In this case, the goal is to predict Y given a certain set of inputs X. The main focus is optimizing the accuracy of the predictions.
Inference: In this case, the interest lies in understanding the relationship between Y and changes in X rather than just predicting Y. The main concern is the interpretability of the model rather than the accuracy of its predictions.
When applying this framework to energy consumption prediction, the dependent variable Y is energy consumption data, while the independent variables usually include elements like weather data, calendar features, and sensor readings from within the buildings.
Once the purpose that the model should serve is clear, it’s time to choose a concrete algorithm for our analysis.
Choosing an algorithm
Depending on the final objective of the analysis, a wide range of data-driven algorithms can be used to model energy consumption, each with its own strengths and weaknesses. The choice of algorithm often involves a trade-off between model interpretability and prediction accuracy, although occasionally time constraints and computational limitations also have to be considered. Five very popular data-driven models frequently used to predict energy consumption are presented here, ranked from highest to lowest interpretability:
Linear Regression Models: These are the simplest and most interpretable models, based on a linear relationship between the input (independent variables like weather, time of day, etc.) and output (energy consumption).
Strengths: Highly interpretable, easy to understand, low computational resources required.
Weaknesses: Limited prediction accuracy due to linearity assumption, often fails to capture complex relationships in the data.
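A minimal sketch of linear regression on synthetic data shows both of its faces at once: prediction and interpretability. The features (outdoor temperature, occupancy) and the ground-truth coefficients are invented for illustration.

```python
# Minimal sketch: linear regression of daily energy use on outdoor
# temperature and occupancy. Data and coefficients are synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 300
temp_c = rng.uniform(-5, 30, n)        # outdoor temperature
occupancy = rng.integers(0, 50, n)     # people in the building

# Ground truth: heating demand falls with temperature, rises with occupancy
y = 100 - 2.0 * temp_c + 0.5 * occupancy + rng.normal(0, 3, n)

X = np.column_stack([temp_c, occupancy])
model = LinearRegression().fit(X, y)

# Each coefficient reads directly as "kWh per unit change of that feature",
# which is exactly the interpretability linear models are valued for
temp_coef, occ_coef = model.coef_
```

The fitted coefficients land close to the true values (-2.0 and 0.5), and a stakeholder can read them off directly; no such direct reading exists for the ensemble and neural models discussed further down.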
Generalized Additive Models (GAMs): GAMs extend linear models by allowing non-linear relationships between the features and the response variable, using a sum of smooth functions instead of a linear combination of the features.
Strengths: High interpretability, able to capture non-linear relationships, more flexible than linear regression models.
Weaknesses: Less interpretable than linear regression, may not perform as well as models like neural networks or ensemble methods for highly complex datasets.
Decision Trees and Random Forests: These models use a tree-like graph to model decisions based on various input parameters.
Strengths: Good interpretability (especially for individual trees), capable of handling both categorical and numerical data, and not sensitive to outliers.
Weaknesses: Single decision trees may suffer from overfitting; although Random Forests address this, they sacrifice some interpretability because they average many individual tree predictions.
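The overfitting point can be made concrete on synthetic data: an unpruned decision tree memorizes the training noise almost perfectly, while a random forest, by averaging many trees grown on bootstrap samples, generalizes better to held-out data.

```python
# Minimal sketch: single decision tree vs. random forest on noisy
# synthetic consumption data. Data is synthetic.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
X = rng.uniform(0, 30, (500, 1))                 # outdoor temperature
y = 50 + 2 * X[:, 0] + rng.normal(0, 5, 500)     # noisy linear demand

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

tree_train = tree.score(X_tr, y_tr)   # near-perfect: the tree memorizes noise
tree_test = tree.score(X_te, y_te)
forest_test = forest.score(X_te, y_te)
```

The single tree scores nearly 1.0 on its own training set but drops noticeably on the test set; the forest trades away some of the tree's plot-it-and-read-it interpretability for that improved generalization.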
Gradient Boosting Machines (GBMs): GBMs are an ensemble method that builds decision trees sequentially, each new tree aiming to correct the errors of the previous one.
Strengths: Often provide superior predictive accuracy, can handle different types of data, able to capture complex, non-linear relationships.
Weaknesses: More prone to overfitting than Random Forests if not properly tuned, less interpretable due to sequential tree building, require significant computational resources.
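The sequential error-correction idea can be observed directly via `staged_predict`, which exposes the ensemble's predictions after each added tree. The interaction between the two synthetic features (temperature and wind speed) is invented for illustration.

```python
# Minimal sketch: gradient boosting on synthetic data, tracking how the
# training error shrinks as trees are added sequentially.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(4)
X = rng.uniform(0, 30, (400, 2))     # temperature, wind speed
y = (40 + 1.5 * X[:, 0] - 0.8 * X[:, 1]
     + 0.1 * X[:, 0] * X[:, 1]       # interaction a linear model would miss
     + rng.normal(0, 2, 400))

gbm = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1,
                                max_depth=3, random_state=0).fit(X, y)

# Each stage fits a tree to the residuals of the ensemble so far,
# so the training error decreases stage by stage
errors = [mean_squared_error(y, pred) for pred in gbm.staged_predict(X)]
```

That steady decrease on the training set is exactly why tuning matters: left to run with too many deep trees, the same mechanism happily fits noise, so the stopping point is usually chosen against a validation set.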
Artificial Neural Networks (ANNs): ANNs, including deep learning models, are highly complex and can model non-linear relationships between inputs and outputs.
Strengths: Able to capture complex, non-linear relationships, perform well on large datasets.
Weaknesses: Least interpretable due to their black-box nature, require significant computational resources, prone to overfitting if not properly regularized.
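A small feed-forward network already illustrates the trade-off: it fits a sharply non-linear signal that a linear model cannot, but its learned weights offer no direct physical reading. The V-shaped demand curve (heating below 18 °C, cooling above) is a synthetic stand-in for a common real pattern.

```python
# Minimal sketch: a small multilayer perceptron on a synthetic non-linear
# consumption signal. Inputs are standardized, which matters for
# gradient-based training.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(5)
X = rng.uniform(-5, 30, (600, 1))    # outdoor temperature
# V-shaped demand: heating below 18 C, cooling above
y = 20 + 2.5 * np.abs(X[:, 0] - 18) + rng.normal(0, 1, 600)

ann = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0),
)
ann.fit(X, y)
r2 = ann.score(X, y)
```

The fit is good, but explaining *why* the network predicts a given value requires post-hoc tools (e.g. SHAP values or partial dependence plots) rather than reading the model itself, which is the interpretability price flagged above.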
This wraps up our second post on predicting energy consumption with machine learning! In our upcoming issue, we will revisit real-world applications and examine them through the lens of the technical insights we've gained.
For a comprehensive introduction to the fundamental concepts of statistical learning, I recommend An Introduction to Statistical Learning (book and online course).