
MAULANA AZAD NATIONAL INSTITUTE OF

TECHNOLOGY, BHOPAL

Solar Forecasting using Artificial Intelligence

BY

Abhishek Katiyar 201116077


Sheersh Jain 201116020
Tanisha Jain 201116054

Under the Supervision of


Dr. Alok Singh
Submitted to the Department of Mechanical Engineering
in partial fulfilment of the requirement
for degree of
Bachelor of Technology
In
Mechanical Engineering
CERTIFICATE

This is to certify that the Project Report entitled "Solar Forecasting using Artificial
Intelligence", which is submitted by Abhishek Katiyar, Sheersh Jain, and Tanisha Jain
in partial fulfilment of the requirement for the award of the degree of B. Tech. in the
Department of Mechanical Engineering of Maulana Azad National Institute of
Technology Bhopal, is a record of the candidates' own work carried out by them
under the supervision of the undersigned, Dr. Alok Singh. The matter embodied in this
thesis is original and has not been submitted for the award of any other degree.

Supervisor:
Date:
DECLARATION

We hereby declare that this submission is our own work and that, to the best of
our knowledge and belief, it contains no material previously published or written
by another person nor material which to a substantial extent has been accepted for
the award of any other degree or diploma of the university or other institute of
higher learning, except where due acknowledgment has been made in the text.

Abhishek Katiyar 201116077


Sheersh Jain 201116020
Tanisha Jain 201116054
ACKNOWLEDGEMENT

It gives us a great sense of pleasure to present the report of the B. Tech Major
Project undertaken during the B. Tech. final year. We owe a special debt of gratitude
to Professor Dr. Alok Singh, Department of Mechanical Engineering, Maulana
Azad National Institute of Technology, for his constant support and guidance
throughout the course of our work. His sincerity, thoroughness and perseverance
have been a constant source of inspiration for us. It is only through his conscientious
efforts that our endeavours have seen the light of day.
ABSTRACT

This comprehensive literature survey navigates the dynamic landscape of machine


learning and deep learning models within the realm of solar energy forecasting.
Delving into 15 papers published between 2021 and 2023, the survey meticulously
explores the evolution of these models. Artificial neural networks (ANN), long
short-term memory (LSTM), and convolutional neural networks (CNN) take
center stage in the majority of studies, showcasing their prowess in accurate
predictions. The report highlights the strategic use of ensemble models like
stacking and bagging to elevate forecasting precision. Additionally, feature
selection techniques and autoencoders emerge as key players in reducing data
dimensionality and enhancing prediction accuracy.

What sets this survey apart is its emphasis on the nuts and bolts of neural networks,
providing readers with a deep dive into the calculations and workings of these
models. The report meticulously dissects the intricacies of the neural network
mechanisms employed in the surveyed papers. Data availability emerges as a
crucial theme, with studies drawing from diverse sources such as meteorological
agencies (NASA, NOAA, ECMWF), satellite images, and sky cameras to bolster
precision. Notably, some research ventures into the realm of real-time data
through the utilization of IoT devices and sensors, underscoring the pursuit of
heightened accuracy in solar energy forecasting.

In addition to the comprehensive literature review, we implemented various


machine learning models and selected the one with the highest accuracy. As part
of our innovation, we developed a user-friendly website where users can make
predictions using our selected model. This website serves as a practical application
of our research findings, providing a seamless interface for solar energy
forecasting.
TABLE OF CONTENTS

1. Introduction
2. Solar Energy Output Prediction – Need for the Study
3. Literature Review
4. Objectives
5. Methodology
6. Results and Discussions
7. Website Interface
8. Conclusions
9. References
CHAPTER – 1
INTRODUCTION

Undoubtedly, the renewable energy revolution stands as a pivotal endeavour,


promising substantial environmental and societal benefits. Among renewable
resources, sunlight emerges as a frontrunner due to its abundance, accessibility,
and sustainability, positioning solar energy as a compelling alternative to
conventional sources. Accurate forecast data becomes imperative for optimizing
energy utilization, system planning, and ensuring the equilibrium and stability of
solar energy systems. In this study, the employed method surpasses current quality
standards, enhancing forecast accuracy.

While the trajectory and energy of the sun can be computed using physical laws,
predicting the nuanced development and production of solar energy necessitates
a blend of physical simulation and artificial intelligence. This complexity arises
from various variables, including sun position, climate, photovoltaic panel
properties, among others, influencing the amount of generated solar energy.
Precision in solar forecasting proves paramount for maintaining the balance,
stability, and planning of energy production within the power grid.

Renewable energy sources, concerning grid operations, are classified into


dispatchable (geothermal, solar with storage, biomass, hydro) and non-
dispatchable (solar cells, windmills, ocean currents). Operating plans hinge on
accurate forecasts for various time periods. This study focuses on the prediction
of solar energy output using machine learning methods. To enhance precision, a
suggested hybrid fusion model amalgamates the strengths of diverse machine
learning algorithms. The model incorporates multiple input factors, such as
weather data, time of day, and location, to forecast solar energy output. This
approach significantly streamlines the implementation of photovoltaic energy
generation and hourly solar radiation forecasting.

1.1 Solar power


Solar power refers to the technology used to harness the sun's energy and convert
it into electricity. Solar power is a clean, renewable source of energy that plays a
crucial role in reducing greenhouse gas emissions and combating climate change.
This is achieved primarily through two methods:
1) Photovoltaic (PV) cells
2) Concentrating solar power (CSP) systems.
Photovoltaic (PV) Cells
PV cells, also known as solar panels, directly convert sunlight into electricity.
They are made of semiconductor materials, such as silicon, which absorb photons
(light particles) and release electrons, generating an electric current.
Solar panels can be installed on rooftops, integrated into building designs (BIPV),
or deployed in large ground-mounted solar farms. The electricity generated by PV
cells can be used immediately, stored in batteries for later use, or fed into the
electricity grid.
Concentrating Solar Power (CSP) Systems
CSP systems use mirrors or lenses to concentrate a large area of sunlight onto a
small area. The concentrated light is then used as heat to generate steam, which
drives a turbine connected to an electric generator. CSP plants are usually built on
a large scale and are more common in areas with a high amount of direct sunlight.
However, the output of solar panels is variable and depends on factors such as
weather conditions and time of day. This variability can make it difficult to
integrate solar power into the electricity grid and manage demand.

Advantages of Solar Power


• Renewable: Solar energy is abundant and inexhaustible. As long as the sun
exists, solar energy can be harnessed anywhere on Earth.
• Reduces Electricity Bills: Solar power can significantly reduce electricity bills
since you can generate your own electricity and even sell surplus power back to
the grid in some cases.
• Low Maintenance Costs: Solar power systems generally require low
maintenance, mainly involving keeping the panels clean and occasional checks by
a technician.
• Technological Advancements: Ongoing research and development are
continually improving the efficiency and reducing the cost of solar panels and
related technology.
• Energy Independence: By investing in solar power, countries, communities, and
individuals can reduce their dependence on imported fuels, enhancing energy
security.

1.2 Artificial Intelligence (AI) and Machine Learning (ML)


The fields of Artificial Intelligence (AI) and Machine Learning (ML) are vast and
interconnected areas of computer science that focus on creating systems capable
of performing tasks that would normally require human intelligence. These tasks
can include recognizing speech, making decisions, translating languages, and
identifying patterns in data. Artificial Intelligence is a broad field of computer
science that aims to create systems or machines that can mimic human intelligence
and perform tasks that typically require human intelligence. AI systems are
designed to perceive their environment, reason about it, and take actions to
achieve specific goals. AI encompasses a wide range of techniques, including
machine learning, natural language processing, computer vision, robotics, and
more.
These processes include learning, reasoning, problem-solving, perception, and
decision-making. AI aims to create systems that can operate autonomously,
understand complex data, and adapt to new situations without human intervention.
Machine Learning is currently one of the most successful approaches to achieving
Artificial Intelligence. By using algorithms that learn from data, machines can make
decisions, predict outcomes, and identify patterns that are too complex for humans
to discern. Instead of being explicitly programmed to perform a certain task, machine
learning algorithms are trained on large datasets to recognize patterns and
relationships within the data. The core idea behind ML is to enable computers to
improve their performance on a specific task over time without being explicitly
programmed for that task.
ML and AI are rapidly evolving fields with broad applications across industries such
as healthcare, finance, transportation, and entertainment. They hold the potential to
transform how we work, live, and interact with technology in the future.

Applications of AI and ML

Healthcare:

• Disease diagnosis and prediction, personalized medicine, medical image analysis.

• Electronic health records analysis for treatment recommendations.

Finance:

• Fraud detection, risk assessment, algorithmic trading, credit scoring.

• Customer segmentation for targeted marketing and personalized services.

E-commerce and Recommendation Systems:

• Product recommendations based on customer behavior and preferences.

• Dynamic pricing, customer churn prediction, and market basket analysis.

Natural Language Processing (NLP):

• Sentiment analysis, language translation, chatbots, speech recognition.

• Named Entity Recognition (NER), text summarization, and language modeling.

Computer Vision:

• Image and video analysis, object detection, facial recognition.

• Autonomous vehicles, medical imaging analysis, surveillance systems.

Manufacturing and Industry 4.0:

• Predictive maintenance, quality control, supply chain optimization.

• Smart factories with autonomous systems and robotics.

Environmental Sciences:

• Climate modeling, weather forecasting, environmental monitoring.

• Species identification, deforestation detection, and natural disaster prediction.

Deep Learning (DL)
Deep learning is a subset of ML that uses neural networks with multiple layers to
learn complex representations of data. Deep learning has achieved remarkable
success in areas such as computer vision, natural language processing, and speech
recognition. At its core, deep learning aims to mimic the human brain's neural
networks, allowing machines to learn from vast amounts of data and make intelligent
decisions without explicit programming.
At the heart of deep learning are artificial neural networks, inspired by the structure
and function of the human brain. These networks consist of layers of interconnected
nodes, or "neurons," each performing simple computations. What makes deep
learning "deep" is the presence of multiple layers, allowing the network to learn
complex representations of data.
One of the key strengths of deep learning is its ability to automatically discover
intricate patterns and features in data, often outperforming traditional machine
learning approaches in tasks such as image and speech recognition. This capability
has led to remarkable advancements, such as self-driving cars, personalized
medicine, and natural language understanding in chatbots and virtual assistants.
Deep learning models vary in complexity, from simple feedforward neural networks
to more complex architectures like convolutional neural networks (CNNs) for
images and recurrent neural networks (RNNs) for sequential data. Recently,
attention mechanisms and transformers have gained popularity for tasks requiring
understanding of long-range dependencies, such as language translation.

1.3 Artificial Neural Network (ANN)


An Artificial Neural Network (ANN) is a computational model inspired by the
structure and function of biological neural networks in the human brain. It's a
fundamental component of deep learning and machine learning, capable of learning
complex patterns and relationships from data. ANNs are versatile and have been
applied in diverse fields such as image recognition, natural language processing,
financial forecasting, and healthcare. Their ability to learn from large datasets and
discover complex patterns makes them a powerful tool in the realm of artificial
intelligence and machine learning.

Structure of an Artificial Neural Network:


1. Input Layer: This layer consists of input nodes that receive the initial data. Each
node represents a feature or attribute of the input data.
2. Hidden Layers: Between the input and output layers, there can be one or more
hidden layers. These layers are where the network learns and extracts patterns from
the input data. Each hidden layer consists of multiple neurons (nodes).
3. Output Layer: The final layer of the network produces the output based on the
patterns learned in the hidden layers. The number of nodes in this layer depends on
the type of problem being solved (e.g., binary classification, multi-class
classification, regression).

Neurons (Nodes):
Each neuron in an ANN performs a computation and introduces non-linearity into
the network. The basic structure of a neuron includes:
• Inputs: Each neuron receives inputs from the previous layer or directly from the
input layer.
• Weights: Each input is multiplied by a weight, which determines its significance to
the neuron's output.
• Bias: A bias term is added to the weighted sum of inputs. It allows the neuron to
adjust the output along with the inputs.
• Activation Function: After summing the weighted inputs and bias, an activation
function is applied. This function introduces non-linearity, enabling the network to
learn complex patterns. Common activation functions include ReLU (Rectified
Linear Activation), Sigmoid, and Tanh.

Types of Artificial Neural Networks:


1. Feedforward Neural Networks (FNN): The simplest form, where data flows in one
direction, from input to output, through hidden layers.
2. Convolutional Neural Networks (CNN): Designed for image processing, using
convolutional layers to extract features and pooling layers to reduce dimensionality.
3. Recurrent Neural Networks (RNN): Suited for sequential data like time series or
text, with connections between nodes forming loops to remember past information.
4. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU): Variants of
RNNs with an improved ability to handle long-term dependencies.

1.4 Solar Forecasting using Artificial Neural Network


Solar forecasting using Artificial Neural Networks (ANNs) is an important
application in renewable energy management, as accurate solar energy predictions
enable better integration of solar power into the grid and optimize energy generation.
ANNs offer a data-driven approach to forecast solar irradiance, taking into account
various environmental and meteorological factors.
Solar energy, a vital component of renewable energy, holds immense potential as a
clean and sustainable power source. However, the variability of solar irradiance due
to weather conditions poses challenges for its efficient integration into the power
grid. Solar forecasting using Artificial Neural Networks (ANNs) addresses these
challenges by providing accurate predictions of solar irradiance, enabling better grid
management and maximizing the utilization of solar power.

The Need for Solar Forecasting


• Grid Integration: Solar power integration into the grid requires precise forecasts to
balance supply and demand.
• Efficiency: Accurate solar forecasts enhance the efficiency of energy generation and
distribution.
• Cost Reduction: Efficient grid management based on forecasts reduces operational
costs and reliance on backup power sources.

Advantages of ANNs for Solar Forecasting


• Complex Relationships: ANNs can model nonlinear relationships between input
features and solar irradiance.
• Adaptability: They adapt to changing weather patterns and seasonal variations.
• Data-driven: ANNs learn directly from data, making them suitable for dynamic and
complex systems like solar energy production.
CHAPTER – 2

SOLAR ENERGY OUTPUT PREDICTION – NEED


FOR THE STUDY

This study is prompted by the escalating demand for solar energy as a renewable
source, emphasizing the critical need for precise solar radiation prediction to
optimize energy generation and management. Despite the widespread application
of machine learning models in solar radiation prediction, the challenge persists in
selecting the most relevant features for accurate forecasts. Unlike existing research
that predominantly focuses on monthly, daily, or specific period-based solar
radiation prediction, our study uniquely targets hourly predictions during the peak
solar light radiation period from ten in the morning to three in the afternoon.

In the realm of renewable source management, there's a noticeable gap in


addressing hourly solar energy output prediction. This study bridges that gap,
aiming for enhanced solar power energy management and optimization. The
primary objective is to craft a model that precisely predicts solar energy generation
from a PV system, incorporating diverse input factors like weather conditions,
location, and system specifications. The envisioned model strives for reliability,
accuracy, and a minimal margin of error.

1. Grid Integration and Stability:

• Solar energy is an intermittent energy source, highly dependent on weather
conditions like cloud cover and sunlight intensity.

• Accurate solar forecasts help grid operators anticipate fluctuations in solar power
generation, enabling better integration with the grid.

• Grid stability is crucial for maintaining a reliable electricity supply. Forecasts aid
in balancing generation and demand, preventing grid instability and blackouts.

2. Efficient Energy Management:

• Solar forecasts assist in optimizing energy management by predicting when
and how much solar power will be available.

• Grid operators can use these forecasts to schedule other energy sources, such
as natural gas or coal plants, to compensate for variations in solar output.

• Energy storage systems can also be efficiently managed with solar forecasts,
storing excess energy during peak production for later use.

3. Economic Viability and Financial Planning:

• Solar energy projects require significant investments in infrastructure.
Accurate forecasts are crucial for project developers and investors to assess
the economic viability of solar projects.

• Forecasts help in estimating the expected energy production and revenue,
influencing decisions on project size, financing, and profitability.

4. Integration with Renewable Energy Sources:

• Solar energy is often part of a mix of renewable energy sources, such as wind
and hydroelectric power.

• Solar forecasts aid in coordinating the generation from different renewables,
ensuring a balanced and reliable supply of renewable energy.

5. Reducing Carbon Footprint:

• Solar energy is a clean and renewable energy source, reducing greenhouse gas
emissions.

• Accurate forecasts enable utilities and policymakers to maximize the use of
solar energy, reducing the reliance on fossil fuels and mitigating climate
change.

6. Emergency Preparedness:

• During emergencies or natural disasters that affect traditional power sources,
solar energy can serve as a reliable backup.

• Solar forecasts help emergency planners anticipate how much solar energy can
be expected during such events, aiding in emergency response planning.

7. Research and Development:

• Ongoing research in solar forecasting drives innovation in prediction models
and technology.

• Improved forecasting models lead to more efficient solar panels, tracking
systems, and overall advancements in solar energy generation.

The study's approach involves evaluating the performance of various machine


learning models—KNN, LGBM, Random Forest, and DNN—in predicting hourly
solar radiation. Special attention is given to assessing the impact of feature
selection using SMA on the prediction accuracy of these models. Additionally, a
pioneering effort is made to develop a hybrid fusion model that integrates the
slime mould algorithm with machine learning models, aiming for unparalleled
accuracy in solar radiation prediction.

This research strives to contribute to the advancement of solar energy


management by addressing the gap in hourly prediction and introducing
innovative techniques for accurate and efficient solar radiation forecasts.
Machine learning can be used to predict solar power output with high accuracy.
Machine learning models are trained on historical data to learn the relationships
between solar power output and various factors such as weather conditions, time
of day, location, humidity, temperature, inclination angle etc. Once trained,
these models can be used to predict solar power output for future time periods.
CHAPTER – 3

LITERATURE REVIEW

Mishra, D. P., Jena, S., Senapati, R., Panigrahi, A., & Salkuti, S. R.
(2023), The objective of this study is to forecast global solar radiation using an
ensemble learning approach that combines the predictions of several machine
learning models. The authors aim to develop an accurate and reliable solar
radiation forecasting model that can be used for renewable energy planning and
management. The study proposes a novel approach that incorporates multiple
machine learning models, including artificial neural networks, support vector
machines, and random forests, to achieve better accuracy and reliability than
single-model approaches. The proposed approach is tested and validated using
real-world data, demonstrating its effectiveness in accurately predicting global
solar radiation. [1]

Deo, R. C., Ahmed, A. M., Casillas-Pérez, D., Pourmousavi, S. A.,


Segal, G., Yu, Y., & Salcedo-Sanz, S. (2023), To develop a kernel ridge
regression (KRR) based approach for correcting cloud cover bias in numerical
weather prediction models, which can improve the accuracy of solar energy
monitoring and forecasting systems. The proposed approach uses KRR with
various meteorological input variables and cloud cover as the output variable, and
it is tested on a case study site in Australia. The study aims to demonstrate the
effectiveness of KRR in correcting the cloud cover bias and providing accurate
solar radiation forecasts, which can support the integration of solar energy into
the grid. [2]

You, L., & Zhu, M. (2023), Developing a digital twin simulation and deep
learning framework for predicting solar energy market load using trade-by-trade
data. The paper proposes a novel approach to construct a digital twin simulation
model that mirrors the real-world solar energy market and integrates it with a deep
learning framework to generate accurate predictions of the market load. The
framework is trained and tested on historical trade-by-trade data, and the
performance of the model is evaluated and compared with traditional machine
learning methods.[3]

Krishnan, N., Kumar, K. R., & Inda, C. S. (2023), The objective of this
paper is to critically review the literature on the impact of solar radiation
forecasting on the utilization of solar energy. The authors aim to identify the key
factors that influence the accuracy of solar radiation forecasting and how it
impacts the utilization of solar energy. The paper also examines the various
techniques used for solar radiation forecasting and evaluates their effectiveness.
The review provides insights into the current state of research in the field and
highlights the gaps in knowledge that need to be addressed to improve the
utilization of solar energy. [4]

Immanual, R., Kannan, K., Chokkalingam, B., Priyadharshini, B.,


Sathya, J., Sudharsan, S., & Nath, E. R. (2023), This study aimed to
develop an Artificial Neural Network (ANN) model for predicting the
performance of solar stills, which are used for desalination and water purification.
The ANN model aims to predict the productivity and efficiency of the solar still,
which are affected by various factors such as solar radiation, ambient temperature,
wind speed, and water depth. The study proposes an ANN model that can
accurately predict the performance of solar stills, which can be used for the design
and optimization of solar stills for various applications in water desalination and
purification.[5]

Rahimi, N., Park, S., Choi, W., Oh, B., Kim, S., Cho, Y. H., ... &
Lee, D. (2023), In this article the objective is to provide a comprehensive
review of ensemble solar power forecasting algorithms. The study aims to
evaluate the performance of different ensemble methods in improving the
accuracy of solar power forecasting. The article discusses various ensemble
methods used in solar power forecasting, including simple averaging, weighted
averaging, bagging, boosting, and stacking. The review also discusses the
advantages and disadvantages of each method and provides insights into the
factors that affect the accuracy of ensemble forecasting models. The article
concludes by highlighting the importance of ensemble methods in improving the
accuracy of solar power forecasting. [6]

Kong, X., Du, X., Xu, Z., & Xue, G. (2023), Developing a predictive
model for solar radiation that can be used for space heating with thermal storage
systems. The authors propose a novel method based on a temporal convolutional
network-attention model, which incorporates both temporal and spatial
information from multiple weather variables. The model aims to improve the
accuracy and reliability of solar radiation prediction, and ultimately optimize the
performance of the thermal storage system. The study also evaluates the
effectiveness of the proposed model and compares it with other commonly used
machine learning algorithms.[7]

Nie, Y., Li, X., Scott, A., Sun, Y., Venugopal, V., & Brandt, A. (2023),
The objective of this paper is to introduce a new dataset called SKIPP'D (Sky
Images and Photovoltaic Power Generation Dataset), which includes high-
resolution sky images and photovoltaic (PV) power generation data. The paper
aims to demonstrate the potential of this dataset in improving short-term solar
forecasting accuracy. Specifically, the paper discusses the collection and pre-
processing of the dataset and presents a case study demonstrating the usefulness
of SKIPP'D for solar power forecasting using machine learning algorithms. The
ultimate goal of the paper is to contribute to the development of accurate and
efficient solar power forecasting methods. [8]
Bezerra Menezes Leite, H., & Zareipour, H. (2023), This article aims
to develop an accurate forecasting model for small behind-the-meter solar sites
that can predict energy production six days ahead. The authors aim to compare
the performance of different forecasting models and evaluate the impact of
weather forecast accuracy, solar site characteristics, and historical data
availability on the accuracy of the energy production forecast. The study focuses
on small solar sites that are connected to the distribution grid and have a capacity
of less than 500 kW, which are becoming increasingly popular due to their
potential to reduce greenhouse gas emissions and support distributed generation.
[9]

Ghimire, S., Deo, R. C., Casillas-Pérez, D., Salcedo-Sanz, S.,


Sharma, E., & Ali, M. (2022), proposed a novel deep learning CNN-LSTM-
MLP hybrid fusion model for feature optimization and accurate prediction of
daily solar radiation. The proposed model combines the strengths of
convolutional neural networks (CNNs), long short-term memory (LSTM)
networks, and multilayer perceptron (MLP) networks for effective feature
extraction and modelling of the complex non-linear relationships between the
input variables and solar radiation. The performance of the proposed model is
evaluated and compared with other state-of-the-art models using real-world solar
radiation data, demonstrating its superiority in terms of prediction accuracy and
robustness. [10]

Patel, D., Patel, S., Patel, P., & Shah, M. (2022), The objective of the
study is to develop a comprehensive and systematic approach for the estimation
of solar radiation and solar energy using artificial neural network (ANN) and
fuzzy logic concept. The study aims to optimize the ANN architecture using
various training algorithms and activation functions to improve the accuracy of
solar radiation and energy estimation. Additionally, the study aims to develop a
fuzzy logic-based inference system to enhance the estimation performance by
integrating expert knowledge and linguistic variables. The proposed approach
aims to provide an effective tool for solar radiation and energy estimation for
practical applications. [11]

Ghimire, S., Deo, R. C., Wang, H., Al-Musaylh, M. S., Casillas-


Pérez, D., & Salcedo-Sanz, S. (2022), Develop a stacked LSTM sequence-
to-sequence autoencoder with feature selection for accurate daily solar radiation
prediction. The study aims to review and improve upon existing models for
predicting solar radiation by incorporating a sequence to-sequence autoencoder
with a stacked LSTM model, which can capture nonlinear relationships and
temporal dependencies in the data. Additionally, the study employs feature
selection to identify the most relevant input variables and reduce the
dimensionality of the problem. The proposed model is evaluated and compared
with existing models to demonstrate its effectiveness in solar radiation prediction.
[12]

Khan, W., Walker, S., & Zeiler, W. (2022), In this study, they propose an
improved deep learning-based ensemble stacking approach for the accurate
forecast of solar photovoltaic (PV) energy generation. The study aims to develop
a model that can handle the nonlinear and complex relationships between the
variables affecting solar PV energy generation, by utilizing the strengths of
different deep learning algorithms in an ensemble stacking framework. The study
seeks to evaluate the performance of the proposed model and compare it with
existing forecasting models, to demonstrate its superiority in terms of accuracy
and robustness. [13]

Shams, M. H., Niaz, H., Hashemi, B., Liu, J. J., Siano, P., & Anvari-
Moghaddam, A. (2021), The authors proposed a novel artificial intelligence-
based approach for predicting and analysing the oversupply of wind and solar
energy in power systems. The proposed approach utilizes machine learning
algorithms to forecast renewable energy generation and demand, and to detect
oversupply events. The approach is evaluated using real-world data from the Irish
power system, and the results demonstrate its ability to accurately predict
oversupply events, as well as its potential to support the integration of large
amounts of renewable energy into power systems. [14]

Sahu, R. K., Shaw, B., & Nayak, J. R. (2021), In this article, the authors
propose a system in which an optimized Extreme Learning Machine (ELM) is
employed to forecast real-time solar power generation (SPG) for the Chhattisgarh
state of India, considering weather conditions for prediction. The study aims to
improve the accuracy of solar power
forecasting by incorporating weather variables such as temperature, humidity, and
wind speed, along with historical solar irradiance data. The proposed model is
expected to aid decision-making in power system operations, energy trading, and
renewable energy integration planning in Chhattisgarh state. [15]
CHAPTER – 4

OBJECTIVES

1. Developing a model that accurately forecasts the amount of solar power


generated given varying environmental conditions achieving a high level of
overall accuracy.
2. Identifying the input environmental parameters having most significant impact
on solar power generation.
3. Developing a user-friendly interface or dashboard that allows various
stakeholders, to easily access and interpret the predictions made by the ML
model.
CHAPTER – 5

METHODOLOGY

5.1 Working of Neural Network

Perceptron:

A simple artificial neuron having an input layer and output layer is called a
perceptron.
What does this Neuron contain?

1. Summation function

2. Activation function

The inputs given to a perceptron are processed by Summation function and


followed by activation function to get the desired output.

Fig 5.1 Perceptron

This is a simple perceptron. When there are many inputs and a large amount of data,
a single perceptron is not sufficient, so the number of neurons must be increased.
This leads to the basic neural network, which has an input layer, a hidden layer, and
an output layer.

Fig 5.2 Neural Network

A neural network always has a single input layer and a single output layer, but it
can have multiple hidden layers. In the figure above, we can see a sample neural
network with one input layer, two hidden layers, and one output layer. As a
prerequisite, let us first look at what an activation function is and the types of
activation functions.
Activation Function: The main purpose of the activation function is to
convert the weighted sum of a neuron's input signals into its output signal. This
output signal then serves as input to the next layer.
Any activation function should be differentiable, since a backpropagation
mechanism is used to reduce the error and update the weights accordingly.
Types of Activation Function:

Fig 5.3 Types of Activation Function


Sigmoid:

1. Output ranges between 0 and 1.

2. A small change in x around zero results in a large change in y.

3. Usually used in the output layer of binary classification.

Tanh:

1. Output ranges between -1 and 1.

2. Output values are centered around zero.

3. Usually used in hidden layers.

ReLU (Rectified Linear Unit):

1. Output is max(0, x), i.e. it ranges from 0 upwards.

2. Computationally inexpensive compared to the sigmoid and tanh functions.

3. Default function for hidden layers.

4. It can lead to "dead" neurons, which can be compensated by applying the Leaky
ReLU function. Minimal NumPy definitions of these functions are sketched below.
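In the sketch below, the 0.01 leak coefficient for Leaky ReLU is an assumed typical
value rather than one specified in this report.

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Zero-centred output in the range (-1, 1).
    return np.tanh(x)

def relu(x):
    # max(0, x): cheap to compute, default choice for hidden layers.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Keeps a small slope (alpha * x) for x < 0 to avoid "dead" neurons.
    return np.where(x > 0, x, alpha * x)
```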

A neural network works based on two principles:


1. Forward Propagation

2. Backward Propagation

Let us understand these building blocks with the help of an example. Here we
consider a single input layer, one hidden layer, and an output layer to keep the
explanation clear.
Forward Propagation:

Fig 5.4 Forward Propagation

1. Each feature is associated with a weight; here X1, X2 are the features and W1, W2
the weights. These serve as inputs to a neuron.

2. A neuron performs two functions: a) Summation, b) Activation.

3. In the summation step, each feature is multiplied by its weight and the bias is
added: Y = W1X1 + W2X2 + b.

4. An activation function is then applied to this sum. The output from this neuron is
multiplied by the weight W3 and supplied as input to the output layer.

5. The same process happens in each neuron; however, the activation function used in
the hidden-layer neurons may differ from the one used in the output layer. A small
forward-pass sketch is given after this list.
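The sketch below assumes the two-input, single-hidden-neuron layout of Fig 5.4 with
sigmoid activations; the numeric inputs, weights, and biases are illustrative
assumptions, not values from the report's data.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative inputs, weights and biases (assumed values).
x1, x2 = 0.5, 0.8            # input features X1, X2
w1, w2, w3 = 0.4, -0.2, 0.7  # W1, W2 (input -> hidden) and W3 (hidden -> output)
b1, b2 = 0.1, 0.05           # biases of the hidden and output neurons

# Summation followed by activation in the hidden neuron: Y = W1*X1 + W2*X2 + b.
h = sigmoid(w1 * x1 + w2 * x2 + b1)

# Hidden output multiplied by W3 and passed through the output neuron.
y_out = sigmoid(w3 * h + b2)
print(y_out)
```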

Backward Propagation:

Backward propagation relies on the chain rule from basic calculus to update the
weights.
Chain Rule:

The chain rule provides a technique for finding the derivative of composite
functions, with the number of functions that make up the composition
determining how many differentiation steps are necessary. For example, a
composite function f(x) is defined as:

f(x) = (g ∘ h)(x) = g(h(x)), so that f'(x) = g'(h(x)) · h'(x) ……..Eq 5.1

Fig 5.5 Chain Rule

Let us apply the chain rule to a single neuron,

Fig 5.6 Chain Rule for a single Neuron

In neural networks, the main goal is to reduce the error; to make this possible, all
the weights are updated through backpropagation. We need to find the change in each
weight such that the error is minimised. To do so we calculate dE/dW1 and dE/dW2.
By the chain rule, each of these gradients is expanded through the quantities that lie
between the error E and the weight; for example,
dE/dW1 = (dE/dOut) · (dOut/dH) · (dH/dW1),
where H is the hidden-neuron output and Out is the network output, and similarly for
dE/dW2 (Eqs 5.2-5.5).

Fig 5.7 Change in weights

Fig 5.8 Procedure

Once the changes in the weights with respect to the error have been calculated, the
next step is to update the weights using the gradient descent procedure, in which each
weight is reduced in proportion to its gradient:

W(new) = W(old) - η · (dE/dW) ……Eq 5.7

where η is the learning rate.
Fig 5.9 New Weights

Forward propagation and backward propagation are repeated over all

samples until the error reaches its minimum value. A sketch of one such update step
is given below.
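The sketch assumes a squared-error loss E = 0.5 * (y - t)^2, sigmoid activations in both
neurons, and an illustrative learning rate; none of the numbers are taken from the
report's data.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative values (assumptions, not the report's data).
x1, x2, t = 0.5, 0.8, 1.0                      # inputs and target output
w1, w2, w3, b1, b2 = 0.4, -0.2, 0.7, 0.1, 0.05
eta = 0.1                                      # learning rate

# Forward pass.
h = sigmoid(w1 * x1 + w2 * x2 + b1)            # hidden-neuron output
y = sigmoid(w3 * h + b2)                       # network output
error = 0.5 * (y - t) ** 2

# Backward pass: chain rule, e.g. dE/dW1 = (dE/dy) * (dy/dh) * (dh/dW1).
dE_dy = y - t
dy_dh = y * (1 - y) * w3                       # through the output neuron
dh_dw1 = h * (1 - h) * x1                      # through the hidden neuron
dh_dw2 = h * (1 - h) * x2

dE_dw3 = dE_dy * y * (1 - y) * h
dE_dw1 = dE_dy * dy_dh * dh_dw1
dE_dw2 = dE_dy * dy_dh * dh_dw2

# Gradient-descent update: W_new = W_old - eta * dE/dW (Eq 5.7).
w1 -= eta * dE_dw1
w2 -= eta * dE_dw2
w3 -= eta * dE_dw3
print(w1, w2, w3, error)
```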

5.2 Calculations

Implementation in Scikit-learn:

For each decision tree, Scikit-learn calculates a node's importance using Gini
Importance, assuming only two child nodes (binary tree):

ni sub(j) = w sub(j) · C sub(j) - w sub(left(j)) · C sub(left(j)) - w sub(right(j)) · C sub(right(j)) …Eq 5.8

• ni sub(j) = the importance of node j

• w sub(j) = weighted number of samples reaching node j

• C sub(j) = the impurity value of node j

• left(j) = child node from left split on node j

• right(j) = child node from right split on node j

Here sub() denotes a subscript.


The importance of each feature on a decision tree is then calculated as:

fi sub(i) = ( Σ over nodes j that split on feature i of ni sub(j) ) / ( Σ over all nodes k of ni sub(k) ) ……….Eq 5.9

• fi sub(i) = the importance of feature i

• ni sub(j) = the importance of node j

These can then be normalized to a value between 0 and 1 by dividing by the sum
of all feature importance values:

normfi sub(i) = fi sub(i) / Σ over all features j of fi sub(j) ………Eq 5.10

The final feature importance, at the Random Forest level, is its average over all
the trees. The sum of the feature's importance values over all trees is calculated
and divided by the total number of trees:

RFfi sub(i) = ( Σ over all trees j of normfi sub(ij) ) / T ……….Eq 5.11

• RFfi sub(i) = the importance of feature i calculated from all trees in the Random
Forest model

• normfi sub(ij) = the normalized feature importance for feature i in tree j

• T = total number of trees

A short scikit-learn sketch of how such importances are obtained in practice is given
below.
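In the sketch, the synthetic data, feature names, and hyperparameters are assumptions
chosen purely for illustration, not the report's actual dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: 200 samples, 3 features (assumed names for illustration).
rng = np.random.default_rng(42)
X = rng.random((200, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.1, 200)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X, y)

# feature_importances_ holds the normalized, tree-averaged impurity-based
# importances described by Eqs 5.8-5.11.
for name, importance in zip(["feature_a", "feature_b", "feature_c"],
                            model.feature_importances_):
    print(f"{name}: {importance:.3f}")
```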
Number of neurons in each hidden layer:

Nh = number of neurons in the hidden layer
Ni = number of input neurons
No = number of output neurons
Ns = number of samples in the training data set
α = an arbitrary scaling factor, usually 2-10

Nh = Ns / (α · (Ni + No)) …….Eq 5.12

Total data = 4213
Training data = 75% of total data = 4213 × 0.75 = 3159.75 ≈ 3160

Ns = 3160, Ni = 20, α = 4.5, No = 1

Nh = 3160 / (4.5 × (20 + 1)) = 33.43

Hence, we chose 32 neurons for the first hidden layer. In this work, each subsequent
hidden layer uses double the neurons of the previous hidden layer; for example, with
32 neurons in the first hidden layer (as calculated), the next hidden layer has 64
neurons. A small sketch of this calculation follows.
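The snippet below is a direct transcription of the rule of thumb in Eq 5.12, using the
values quoted above.

```python
def hidden_neurons(n_samples, n_inputs, n_outputs, alpha):
    # Rule of thumb: Nh = Ns / (alpha * (Ni + No)), with alpha typically 2-10.
    return n_samples / (alpha * (n_inputs + n_outputs))

nh = hidden_neurons(n_samples=3160, n_inputs=20, n_outputs=1, alpha=4.5)
print(round(nh, 2))  # ~33.4, rounded down to 32 neurons for the first hidden layer
```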

5.3 Model Training


• We collected the data from Kaggle as an Excel file with 4213 rows and 20 input
features. Every feature contributes differently to the prediction value.

Fig 5.10 Input Features

• Split the preprocessed dataset into training and test sets, using 75% of the
data for training and the remaining 25% for evaluation.

• Use a fixed random seed (e.g., 42) to ensure reproducibility of results.

• Apply feature scaling (standardization) to both the input features and the target
variable.

• Train artificial neural network (ANN) models with the following configuration
(a minimal training sketch follows this list):

• Architecture: two hidden layers with 32 and 64 neurons, respectively.

• Activation function: ReLU (Rectified Linear Unit) in both hidden layers.

• Optimization algorithm: Adam optimizer.

• Loss function: Mean Squared Error (MSE).

• Training for 150 epochs to allow the model to converge.
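This sketch uses scikit-learn for splitting and scaling and Keras for the ANN; the file
name and target-column name are placeholders (assumptions), since the exact Kaggle
file is not reproduced in this report.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow import keras

# Placeholder file and column names; the actual dataset has 4213 rows and 20 features.
df = pd.read_excel("solar_data.xlsx")
X = df.drop(columns=["generated_power_kw"]).values   # assumed target column name
y = df["generated_power_kw"].values.reshape(-1, 1)

# 75/25 split with a fixed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Standardize both the inputs and the target, as described above.
x_scaler, y_scaler = StandardScaler(), StandardScaler()
X_train, X_test = x_scaler.fit_transform(X_train), x_scaler.transform(X_test)
y_train, y_test = y_scaler.fit_transform(y_train), y_scaler.transform(y_test)

# Two hidden layers (32 and 64 neurons, ReLU), Adam optimizer, MSE loss, 150 epochs.
model = keras.Sequential([
    keras.layers.Input(shape=(X_train.shape[1],)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse",
              metrics=[keras.metrics.RootMeanSquaredError()])
model.fit(X_train, y_train, epochs=150,
          validation_data=(X_test, y_test), verbose=0)
```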

5.4 Evaluation Metrics


• Assess the performance of the trained models using the following evaluation
metrics (a short sketch computing these metrics follows the list):

• Root Mean Squared Error (RMSE): measures the average magnitude of
prediction errors.

RMSE = sqrt( (1/N) Σ(i=1 to N) (y(i) – ŷ(i))² ) …….Eq 5.13

where N is the number of data points, y(i) is the i-th measurement, and ŷ(i) is
its corresponding prediction.

• Training Loss: represents the discrepancy between predicted and actual
values on the training dataset.

• Mean Absolute Error (MAE): measures the average absolute difference
between predicted and actual values.

MAE = (1/N) Σ(i=1 to N) |y(i) – ŷ(i)| …….Eq 5.14

where N is the number of data points, y(i) is the i-th measurement, and ŷ(i) is
its corresponding prediction.

• Training R-squared (R2): indicates the proportion of variance in the target
variable explained by the model on the training dataset.

• Test R-squared (R2): measures the model's ability to generalize to new,
unseen data on the test dataset.

R2 = 1 – (SSres / SStot) …..Eq 5.15
SStot = Σ(i=1 to N) (y(i) – ȳ)² ……Eq 5.16
SSres = Σ(i=1 to N) (y(i) – ŷ(i))² …….Eq 5.17

where y(i) is the i-th measurement, ŷ(i) is its corresponding prediction, and ȳ is the
mean of the measurements.
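The sketch below assumes the model, X_test, and y_test objects from the training
sketch in Section 5.3; predictions and metrics are reported in the scaled units used
for training.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Predictions on the held-out test set (still in scaled units).
y_pred = model.predict(X_test)

rmse = np.sqrt(mean_squared_error(y_test, y_pred))   # Eq 5.13
mae = mean_absolute_error(y_test, y_pred)            # Eq 5.14
r2 = r2_score(y_test, y_pred)                        # Eq 5.15

print(f"RMSE: {rmse:.4f}  MAE: {mae:.4f}  R2: {r2:.4f}")
```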

5.5 Model Comparison


The efficiency and performance of a deep learning model depend significantly on the
number of hidden layers. The results of the trained models are presented below and
their performance is compared on the basis of the evaluation metrics.
Summary of Model Performance
1. Model 1:
Training Metrics:
- Training Loss: 0.0448
- Training RMSE: 0.2117
- Training MAE: 0.1426
- Training R2: 82.62%
- Test Metrics:
- Test R2: 74.24%
Summary:
Underfitting: The model achieves a relatively low R2 score on both the training
dataset (82.62%) and the test dataset (74.24%). This suggests that the model
might be too simple and not capturing enough information from the training

data, leading to reduced performance on both training and test datasets. The
model may benefit from increased complexity or improved feature engineering.
2. Model 2:
Training Metrics:
- Training Loss: 0.0880
- Training RMSE: 0.2967
- Training MAE: 0.2034
- Training R2: 91.2%
- Test Metrics:
- Test R2: 76.3%
Summary:
Balanced Performance: The model achieves a high R2 score on both the
training dataset (91.2%) and the test dataset (76.3%). This indicates that the
model has good generalization performance and is less likely to overfit the
training data.
3. Model 3:
Training Metrics:
- Training Loss: 0.1569
- Training RMSE: 0.3961
- Training MAE: 0.2785
- Training R2: 95.3%
- Test Metrics:
- Test R2: 72.3%
Summary:
Overfitting: The model achieves a high R2 score on the training dataset (95.3%),
indicating good performance. However, there is a noticeable drop in the R2
score on the test dataset (72.3%), suggesting potential overfitting. The model
might be too complex and capturing noise in the training data, leading to
reduced performance on unseen data.

5.6 Selection of Best Model


Based on the evaluation of the trained models, Model 2 demonstrates the most
balanced performance with a high R2 score on both the training and test
datasets. It achieves a good level of accuracy without showing signs of
significant overfitting or underfitting. Therefore, Model 2 is selected as the best
ANN model for predicting solar energy output in this study.
CHAPTER – 6
RESULTS AND DISCUSSIONS

The final results obtained are:

• Loss: 0.0885
• Root mean squared error: 0.2974
• Mean absolute error: 0.2040
• R2 score of the whole data frame: 0.867086
• R2 score of the training set: 0.913460
• R2 score of the test set: 0.761501

Fig 6.1 Epochs vs Root Mean Square Error


The forecasted solar power output from the model is compared with the actual
observed values, as outlined below.

S.no Actual Predicted

Table 6.1 Actual Vs Predicted Solar Power Output

Fig 6.2 Predictions Vs Real Data


The three input parameters having the most significant impact on solar power
generation are shortwave radiation backward sfc, incidence angle and
azimuth.

Table 6.2 Impact of every Input Feature


CHAPTER – 7
WEBSITE INTERFACE

WEBSITE LINK - https://solar-power-prediction.vercel.app/


CHAPTER – 8

CONCLUSIONS
1) Accuracy and Performance: Achieving an 86% accuracy in solar power
generation forecasts is commendable and marks a substantial contribution to
the field. It would be beneficial to delve deeper into the methodology that
led to such a high level of accuracy. Exploring the types of machine
learning algorithms or models used, the data preprocessing steps, and how
the model was trained (including any challenges faced during the training
process) can provide valuable insights. Additionally, comparing your
model's performance with existing models or benchmarks could further
emphasize its significance.
2) Feature Importance: Identifying the key environmental factors that influence
solar power generation offers a clear direction for both technological
improvements and research. It would be interesting to explore how these
factors were determined to be of significant importance. For instance,
discussing the feature selection methods or the statistical techniques used to
quantify the impact of each environmental parameter could enrich the
understanding of your model's workings. Moreover, considering the
negative impact of the incidence and azimuth angles, it would be insightful
to discuss potential strategies or technologies that could mitigate their
adverse effects on solar power generation.
3) User-Friendly Dashboard: Developing a user-friendly platform for
predicting solar power output is a crucial step in making your research
applicable in real-world scenarios. It would be beneficial to outline the
features of the dashboard, such as the types of input parameters required,
how the predictions are presented to the users, and any tools or
functionalities provided to interpret the results. Spotlighting user feedback
and case studies emphasizes the dashboard's utility for energy planners and
grid operators.
REFERENCES

[1]. Mishra, D. P., Jena, S., Senapati, R., Panigrahi, A., & Salkuti, S. R. (2023).
Global solar radiation forecast using an ensemble learning approach. International
Journal of Power Electronics and Drive Systems, 14(1), 496.

[2]. Deo, R. C., Ahmed, A. M., Casillas-Pérez, D., Pourmousavi, S. A., Segal, G.,
Yu, Y., & Salcedo-Sanz, S. (2023). Cloud cover bias correction in numerical
weather models for solar energy monitoring and forecasting systems with kernel
ridge regression. Renewable Energy, 203, 113-130.

[3]. You, L., & Zhu, M. (2023). Digital Twin simulation for deep learning
framework for predicting solar energy market load in Trade-By-Trade data. Solar
Energy, 250, 388-397.

[4]. Krishnan, N., Kumar, K. R., & Inda, C. S. (2023). How solar radiation
forecasting impacts the utilization of solar energy: A critical review. Journal of
Cleaner Production, 135860.

[5]. Immanual, R., Kannan, K., Chokkalingam, B., Priyadharshini, B., Sathya, J.,
Sudharsan, S., & Nath, E. R. (2023). Performance Prediction of solar still using
Artificial neural network. Materials Today: Proceedings, 72, 430-440.

[6]. Rahimi, N., Park, S., Choi, W., Oh, B., Kim, S., Cho, Y. H., ... & Lee, D.
(2023). A Comprehensive Review on Ensemble Solar Power Forecasting
Algorithms. Journal of Electrical Engineering & Technology, 1-15.

[7]. Kong, X., Du, X., Xu, Z., & Xue, G. (2023). Predicting solar radiation for
space heating with thermal storage system based on temporal convolutional
network-attention model. Applied Thermal Engineering, 219, 119574.
[8]. Nie, Y., Li, X., Scott, A., Sun, Y., Venugopal, V., & Brandt, A. (2023).
SKIPP’D: A SKy Images and Photovoltaic Power Generation Dataset for short-
term solar forecasting. Solar Energy, 255, 171-179.

[9]. Bezerra Menezes Leite, H., & Zareipour, H. (2023). Six Days Ahead
Forecasting of Energy Production of Small Behind-the-Meter Solar Sites.
Energies, 16(3), 1533.

[10]. Ghimire, S., Deo, R. C., Casillas-Pérez, D., Salcedo-Sanz, S., Sharma, E.,
& Ali, M. (2022). Deep learning CNN-LSTM-MLP hybrid fusion model for
feature optimizations and daily solar radiation prediction. Measurement, 202,
111759.

[11]. Patel, D., Patel, S., Patel, P., & Shah, M. (2022). Solar radiation and solar
energy estimation using ANN and Fuzzy logic concept: A comprehensive and
systematic study. Environmental Science and Pollution Research, 29(22), 32428-
32442.

[12]. Ghimire, S., Deo, R. C., Wang, H., Al-Musaylh, M. S., Casillas-Pérez, D.,
& Salcedo-Sanz, S. (2022). Stacked LSTM sequence-to-sequence autoencoder
with feature selection for daily solar radiation prediction: a review and new
modeling results. Energies, 15(3), 1061.

[13]. Khan, W., Walker, S., & Zeiler, W. (2022). Improved solar photovoltaic
energy generation forecast using deep learning-based ensemble stacking
approach. Energy, 240, 122812.

[14]. Shams, M. H., Niaz, H., Hashemi, B., Liu, J. J., Siano, P., & Anvari-
Moghaddam, A. (2021). Artificial intelligence-based prediction and analysis of
the oversupply of wind and solar energy in power systems. Energy Conversion
and Management, 250, 114892.
[15]. Sahu, R. K., Shaw, B., & Nayak, J. R. (2021). Short/medium term solar
power forecasting of Chhattisgarh state of India using modified TLBO optimized
ELM. Engineering Science and Technology, an International Journal, 24(5),
1180-1200.
