Transmission Line Fault Detection, Syedazehranadeemdsai2

This document is the final project report submitted by Syeda Zehra Nadeem to the NED University of Engineering and Technology in partial fulfillment of the requirements for a Postgraduate Diploma in Data Science with Artificial Intelligence. The project aims to develop a system for detecting and classifying faults on transmission lines using deep learning techniques, specifically convolutional neural networks and long short-term memory models. The report describes the methodology, implementation including data collection and preprocessing, model building and training, and evaluation of results. The long short-term memory model achieved higher test accuracy and lower loss than the convolutional neural network, demonstrating its superiority for this application. The report concludes by acknowledging challenges in classifying certain fault types and providing recommendations for


NED UNIVERSITY OF ENGINEERING & TECHNOLOGY

TRANSMISSION LINE FAULT DETECTION AND CLASSIFICATION USING DEEP LEARNING TECHNIQUES
DATA SCIENCE BATCH-2: FINAL YEAR PROJECT REPORT

SUPERVISOR: Dr. NAJEED AHMED KHAN

SYEDA ZEHRA NADEEM


6-8-2023
NED UNIVERSITY OF ENGINEERING AND TECHNOLOGY

Centre of Multidisciplinary Postgraduate Programmes (CMPP)

Postgraduate Diploma (PGD) Programmes

FINAL PROJECT REPORT

A Project Report submitted in partial fulfillment of the requirements for a Postgraduate Diploma in Data Science with Artificial Intelligence (AI)

Name of Student: Syeda Zehra Nadeem

Batch: Batch-2

Project Title: Transmission Line Fault Detection and Classification Using Deep Learning Techniques

Name of Supervisor: Dr. Najeed Ahmed Khan

________________________
Signature of Supervisor
CERTIFICATE

This is to certify that Mr. / Ms. Syeda Zehra Nadeem of batch II has successfully completed the PGD

project in partial fulfillment of requirements for a PGD in Data Science with Artificial Intelligence

(AI) (PGD Title) from NED Academy, NED University of Engineering and Technology, Karachi,

Pakistan.

Project Supervisor

Dr. Najeed Ahmed Khan

Instructor/Supervisor

NED Academy

Name, Designation, Organization


DECLARATIONS

I hereby state that this Project titled, Transmission Line Fault Detection and Classification

using Deep Learning Techniques, is my own work and has not been submitted previously by

me for taking any degree/ diploma from anywhere else in the world.

At any time if my statement is found incorrect, NED University of Engineering and

Technology has the right to withdraw this PGD.

Signature:

Student: Syeda Zehra Nadeem

Date: 08-06-2023
PLAGIARISM UNDERTAKING

I solemnly declare that the research work presented in this PGD Project titled: Transmission Line Fault Detection and Classification using Deep Learning Techniques, is solely my research work except where the acknowledgment of the sources is made.

Signature:
Student: Syeda Zehra Nadeem

Date: 08-06-2023
Contents

ABSTRACT

Chapter 1: INTRODUCTION
1.1. Background of Study
1.2. Problem Statement
1.3. Aims and Objectives
1.4. Limitations of Study
1.5. Thesis Structure

Chapter 2: LITERATURE REVIEW

Chapter 3: METHODOLOGY
3.1. Install Necessary Libraries

Chapter 4: Implementation
4.1 Data Source
4.2 Exploration and Implementation of Data Analytics Tools
4.2.1 Data Acquisition
4.2.2 Data Cleansing and Modification
4.2.3 Feature Extraction
4.2.4 Data Normalization
4.2.5 Label Encoding and One-hot Encoding
4.2.6 Train-Test Split
4.3 Deep Learning Algorithms
4.3.1 Convolution Neural Network
4.3.2 Long Short-Term Memory (LSTM)

Chapter 5: EXPERIMENTAL RESULTS AND DISCUSSION
5.1 Evaluation of Test Data
5.1.1 Convolutional Neural Network (CNN) Model
5.1.2 Long Short-Term Memory (LSTM) Model
5.1.3 LSTM Vs CNN
5.2 Training and Validation Curves
5.2.1 CNN Training and Validation Loss
5.2.2 Long Short-Term Memory (LSTM) Training and Validation Loss
5.2.3 Convolutional Neural Network (CNN) Training and Validation Accuracy
5.2.4 Long Short-Term Memory (LSTM) Training and Validation Accuracy
5.3 Classification Report
5.4 Confusion Matrix Of Class Prediction Vs. True Class
5.5 Receiver Operating Characteristic (ROC) Curve
5.6 Discussion

Chapter 6: CONCLUSION AND RECOMMENDATION
6.1. Conclusion
6.2. Recommendations

CHAPTER 7 REFERENCES


ABSTRACT

It is pivotal for a modern society to have a power system that is reliable and robust. All sectors of

human civilization including industries, households, and businesses rely heavily on power

distribution. Any power disruptions can have everlasting repercussions. Hence, prompt detection

and classification of power system faults are crucial. The main focus of the research project is to

develop a fault detection and classification system through a deep learning approach that is

efficient and accurate.

The study highlights the importance of fault detection in power systems. Particularly, shunt

faults in electrical networks are prevalent. The need for advanced fault detection techniques is

discussed. Exploring the deep learning potential is the primary aim of the project, especially

concentrating on Convolutional Neural Networks (CNN) and Long Short-Term Memory

(LSTM) models.

The research methodology is comprehensive, running from data acquisition and detailed pre-processing through model training and evaluation. Both Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) models are used to detect and classify the shunt faults.

The LSTM model proves superior, with higher test accuracy and lower test loss. The LLLG shunt fault remains challenging to classify because of its complex nature, and this is acknowledged throughout the project.

While highlighting the importance of work done so far, the study concludes by making future

recommendations for challenging fault classes. Increasing robustness and exploring ensemble

methods are recommended in this regard.


Chapter 1: INTRODUCTION

Background of Study

Power systems are of foremost importance today: they provide the basic framework on which our society runs. Electricity supply and distribution sustain our industries, households, businesses, and especially our economy. The entire human ecosystem depends heavily on uninterrupted and seamless power transmission systems worldwide; without them, the productivity and functionality of every sector are disturbed. After power is generated at the generation station, high-voltage electricity is transmitted over long distances using a network of transmission lines, towers, and substations. High-voltage AC is used for transmission because it reduces energy losses.

Figure 1.1 A typical power distribution and transmission line system


Transmission lines are the sole link between the power generation station and the distribution centers, ensuring that electricity flows throughout the world, provided that their operational integrity remains intact. The performance of transmission lines can be affected by several faults: 85 to 87% of power system faults occur on transmission lines (M. Singh, 2011). For the system to remain efficient, reliable, and fast, robust fault detection techniques must be developed.

Faults that remain undetected can cause severe economic losses because of equipment damage,

production stoppage, and impending safety hazards. For sectors that are heavily dependent on

power supply, even a second-long disruption can cause substantial losses. Unattended faults are

potential hazards that can lead to fires and explosions.

Traditionally, fault detection and classification in transmission lines were done through a variety of methods such as relay-based protection schemes, signal processing techniques, and expert systems (Zhang, 2016). However, these methods perform poorly in complex and ever-changing environments. Power systems keep growing more complex due to advancements in technology and changing demands, so adapting the above-mentioned methods is difficult.

This motivates the exploration of more advanced and efficient techniques, such as employing artificial intelligence (AI) for the detection and classification of faults in transmission lines. AI has shown commendable results in many fields, from computer vision to language processing. Deep learning, a subfield of machine learning, has shown especially remarkable progress in modelling complex systems resembling the power system at hand.
Deep learning can automate learning and feature extraction from raw data without the need for manual feature engineering. Many previous research projects have used neural-network-based approaches to solve power system problems. Interest in this approach is growing in fault analysis as well, given that deep learning models have consistently proven effective in tasks such as text analysis and image recognition. Deep learning appears to be a promising solution to the above-mentioned problem (Ali Raza, 2020).

Reliance on expert knowledge is a major roadblock in developing new fault diagnosis methods. Deep learning can bypass this reliance by combining feature extraction and classification in a single model, so that accurate fault detection and classification can be performed without hand-crafting highly complex and sophisticated techniques.

In one such application, a convolutional neural network was used for back-to-back MMC HVDC fault classification, extracting complex features from the current and voltage readings of the respective transmission lines (Qinghua Wang, 2020).

Transmission line faults can be divided into two types: series (open-conductor) faults and shunt (short-circuit) faults. This project focuses only on shunt faults; from here on, every fault discussed is of the shunt type. The 11 such faults are shown in Figure 1.2 below.

Figure 1.2 Types of faults in overhead Transmission lines

The occurrence of several types of faults in a Transmission line is as follows (Prerana P. Wasnik,

2019):

Three-phase short circuit – 3%

Line-to-line fault – 8-10%

Two line-to-ground fault – 10-17%

One line-to-ground fault – 70-80%

A transient condition begins whenever a fault occurs, resulting in overcurrents that often cause lasting damage to the system. A system needs to be in place for accurate and efficient monitoring of the transmission network so that preventive measures can be adopted before any serious complications occur (Prerana P. Wasnik, 2019). Conventional methods like impedance-based techniques and relay-based protection systems have proved quite fruitful in the past for fault detection and classification (Jalal Sahebkar Farkhani, 2020). However, their efficacy is limited, as they cannot accommodate the vast and highly complex power systems of today. With advancements in technology, a need arises for better, more adaptable, and smarter fault detection and classification techniques.

A subset of Machine learning is Deep learning, which employs complex neural networks to

unfold deep intricacies in data. Informed decisions are then made upon the information gathered

from the data. The innate ability of deep learning models to autonomously extract complex and

hierarchical features from raw data, combined with their scalability and adaptability, has sparked

immense interest in their application to power system fault detection and classification

challenges (Majid Jamil, 2011).

Ideally, the model should be robust enough that, given only the three phase voltages and currents at a single timestamp, it can accurately detect and classify a transmission line fault. The project aims to explore the possibility of developing a deep learning technique for the detection and classification of transmission line faults, building on the previous work done in the field.

Notably, most work in the fault detection area accounts for only 10 types of faults, ignoring the LLLG fault (ABCG). This is mainly because, unlike in any other fault, the phase currents spike rather than dip. This is the research gap that this project largely addresses. To train the model more accurately, all 11 shunt faults are taken into account.
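
As an illustration of the class set described above, the following minimal sketch enumerates the 11 shunt faults together with a no-fault class. The label strings (AG, AB, ABG, ABC, ABCG, NoFault) follow common phase-naming practice and are an assumption; they are not taken from the project's dataset.

```python
from itertools import combinations

PHASES = ("A", "B", "C")

def shunt_fault_labels():
    """Return hypothetical labels for the 11 shunt faults plus a no-fault class."""
    single_line_ground = [p + "G" for p in PHASES]                    # AG, BG, CG
    line_line = ["".join(pair) for pair in combinations(PHASES, 2)]   # AB, AC, BC
    double_line_ground = [pair + "G" for pair in line_line]           # ABG, ACG, BCG
    three_phase = ["ABC"]                                             # LLL
    three_phase_ground = ["ABCG"]                                     # LLLG, often omitted
    faults = (single_line_ground + line_line + double_line_ground
              + three_phase + three_phase_ground)
    return faults + ["NoFault"]

labels = shunt_fault_labels()
print(len(labels), labels)
```

Dropping the ABCG entry gives the 10-fault set that most earlier work uses.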

Problem Statement

Timely detection and accurate classification of Transmission Line (TL) faults is a critical challenge for today's power transmission and distribution networks. Conventional methods often struggle to carry out real-time analysis, which compromises grid stability and extends downtime. A Deep Learning (DL) solution is needed that is robust and capable of efficiently detecting and classifying power line faults, delivering immediate results that minimize power disruption downtime. The proposed system intends to leverage advanced deep learning models to further enhance fault detection and classification accuracy, thereby increasing the reliability and efficiency of overhead transmission line networks.

Aims and Objectives

The following are the aims and objectives of the project:

To analyze the dataset of the 11 shunt faults in transmission lines and a no-fault condition, and extract the required variables using different data analytics tools and techniques.

To fill a research gap in shunt faults by including the ABCG fault in the classification.

To develop deep learning models for shunt fault classification.

To perform a comparative analysis of Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) models.

To contribute towards power grid stability and resilience.

1.4. Brief Methodology

Data Acquisition is the prerequisite of any Machine learning endeavor. The data set must be

reliably sourced, mimicking real-life conditions for our trained model to work effectively. For

this purpose, a robust MATLAB Simulink simulation for a Power system was created. Data for

all 11 fault conditions and a No-fault condition was acquired. The acquired dataset is then

meticulously pre-processed for subsequent analysis.

The project delves into multiple Deep learning models including Dense, Convolutional Neural

Networks (CNN), and Long Short-Term Memory (LSTM). The main objective remains to
provide the most efficient and accurate fault detection and classification model, hence

contributing towards the collective human effort of building and maintaining a better power

system.

Limitations of Study

The following are the limitations of the study project:

Limited real open-source data available.

Series (open-conductor) faults are not accounted for.

Limited literature available

Thesis Structure

The thesis report is divided into six chapters. A summary of each chapter is given below.

Chapter 1: This chapter is the introduction to the thesis and presents the theme of inquiry. The background of the study explains the significance of shunt fault classification in transmission lines, and the justification for the study is also given.

Chapter 2: This chapter covers the literature review. It surveys the literature on fault classification in power systems and provides the foundation regarding data analytics and multiple deep learning models.

Chapter 3: This chapter explains the methodology. It discusses the methods, tools, and techniques used for feature engineering and feature selection, and explains the intricacies and particularities of the deep learning models being used.

Chapter 4: This chapter explains the implementation of the whole project, from data acquisition to model fitting.

Chapter 5: This chapter discusses the results and findings. It covers the detailed implementation steps related to the data analytics tools and deep learning models, and presents the data visualization results for the best-fitted model.

Chapter 6: This chapter shares the conclusion and future scope of the project. Only the major findings are discussed, and recommendations are made based on them.
Chapter 2: LITERATURE REVIEW

Sami and Junio (2017) have tried to predict gold prices. By using machine learning techniques, they

have analyzed twenty-two market variables. The study revealed that machine learning models

such as artificial neural networks and linear regression are the most appropriate techniques to

predict future gold rates. Hence the results are expected to be fruitful for investors and financial

institutions.

Makala and Li (2022) used autoregressive integrated moving average (ARIMA) and support vector machine (SVM) techniques to predict future gold rates. The study used daily data from the World Gold Council from 1979 to 2019. Data up to 2014 was used to train the models, and the data beyond 2014 was used for validation. The results showed that the SVM outperformed the ARIMA model.

Chukwudike, et al. (2020) used an artificial neural network technique to predict future gold rates. The study used monthly gold prices in US dollars from October 2004 to February 2020. The artificial neural network (ANN) model was found to be an adequate technique for predicting gold prices. The study further carried out graphical analysis to confirm the accuracy of the model. The predicted results suggested a fall in gold prices in the future.

Arena, et al. (2021) predicted gold prices using machine learning algorithms. The study used various economic indices from different countries and businesses. Two models, an artificial neural network model and a linear regression model, were used for the analysis.

Vidya and Hari (2020) carried out research to predict future gold rates. The study used an LSTM network and an ANN to conduct the underlying analysis. It was observed that gold rates are nonlinear in nature, so gold price swings were represented graphically in the form of an exponential curve. The study used data from the World Gold Council. Artificial neural networks were found to be the most appropriate model to deal with nonlinearities in the data. The results of the study showed the best predictions for future gold rates.

Rady, et al. (2021) took into account ARIMA, DT, RF and GBT models to predict future gold rates. The study used time series data of monthly gold rates from November 1989 to December 2019. The researchers compared the underlying models to find the best forecasting technique. The study revealed that the results of RF were more precise than those of the DT, GBT and ARIMA models, so RF turned out to be the most appropriate technique for forecasting gold prices.

Bingo, et al. (2020) examined the association between gold rates and economic variables, which serve as indicators of financial and geopolitical chaos. The study applied multiple linear regression, support vector machine and autoregressive integrated moving average (ARIMA) algorithms. The results revealed that the ARIMA model performed very well. It was suggested that during pandemics, investors should consider swings in historical gold rates in order to capture the fluctuations in gold prices in similar time periods.


Hong and Majid (2021) tried to predict gold rates using two machine learning algorithms, an artificial neural network (ANN) and LSTM, alongside an autoregressive integrated moving average (ARIMA) model. The study used daily gold rates from the World Gold Council from 3 September 2018 to 30 October 2020. LSTM and ANN outperformed the ARIMA model in forecasting the gold price.

Sarangi, et al. (2021) have used various statistical and machine learning techniques to predict

the expected return on gold investment. The underlying study has tried to explore the efficacy

of a machine learning based hybrid model in order to predict the future gold rates. Artificial

neural network (ANN) model has been used to predict monthly gold rates in India dated from

January 2012 to June 2021. Results have revealed that ANN is the best model to predict

future gold rates.

Abdullah and Chena (2020) carried out a study to predict gold prices using machine learning techniques. The study used weekly time series data for the period from 1 January 2009 to 1 June 2018, collected from the investing.com website. An autoregressive integrated moving average (ARIMA) model was used for the analysis. To evaluate its accuracy, the researchers used the mean absolute error (MAE) and mean absolute percentage error (MAPE) metrics. It was observed that the more data available for prediction, the more accurate the results. The study reported 99.22% accuracy in its forecasts over 416 weeks.

Chandaria and Suresh (1991) explored appropriate machine learning algorithms to predict future gold rates. The study obtained monthly gold price data for India from December 1999 to November 2019, collected from the Index Mundi website. It used various machine learning techniques, namely linear regression, random forest, support vector regression and the moving average method. On comparison, the regression models were found to be the most appropriate for predicting future gold rates.

Yurtsever (2021) explored the performance of LSTM, Bi-LSTM and GRU in predicting future gold rates using monthly data. The study used economic indices such as crude oil price, consumer price index, stock market index, effective exchange rate and interest rate. LSTM turned out to be the best model.

Khan (2021) used both linear and non-linear models, namely auto-regressive integrated moving average (ARIMA) and artificial neural network (ANN), to forecast gold prices; the combination is referred to as the ARIMA-ANN model. The study collected data for Pakistan from 1 July 2003 to 1 June 2021. The data was divided into two parts: the models were fitted on the first part and evaluated on the second. The study used two error metrics, root mean square error (RMSE) and mean absolute error (MAE), to assess the models. The results revealed that ANN outperformed ARIMA in predictive validity, and the findings supported the ARIMA-ANN combination, which delivered the best predictions compared to ARIMA and ANN alone.
Chapter 3: METHODOLOGY

This chapter provides an insight into the processes and techniques fundamental to the project and how they were selected to extract useful outcomes. It describes the project steps along with the necessary algorithms and equations, which helps in understanding the project implementation.

3.1. Install Necessary Libraries

Begin by installing the necessary libraries, including Keras, scikit-learn (sklearn), NumPy, and Pandas; pickle ships with the Python standard library and needs no separate installation. Keras is the primary library for deep learning model building. scikit-learn, NumPy and Pandas provide additional functionality for data manipulation and pre-processing. With these in place, the environment for the project is ready.
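
Before any modelling starts, it can help to confirm that the listed libraries are importable. The sketch below is one possible check using only the standard library; the import names in `REQUIRED` are assumed from the libraries named above, and `missing_modules` is a hypothetical helper, not part of the project code.

```python
import importlib.util

# Import names assumed from the libraries listed above.
# pickle is part of the Python standard library.
REQUIRED = ["keras", "sklearn", "numpy", "pandas", "pickle"]

def missing_modules(names):
    """Return the modules from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED)
if missing:
    print("Install before continuing:", ", ".join(missing))
else:
    print("All required libraries are available.")
```

Note that the import name `sklearn` corresponds to the package published as `scikit-learn`.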

3.2. Data Acquisition and Pre-processing

Data acquisition is a pivotal phase in deep learning that involves gathering data from the source, preparing it, and then organizing it into a format suitable as input to the model. The process involves sourcing data from various mediums and sources. It is very important that the acquired data be diverse, responsibly sourced and of high quality, since it directly impacts the performance of the model.

Once acquired, the data must be pre-processed to make it suitable for use with deep learning. Pre-processing involves several tasks:

• Data Cleaning: Remove any inconsistencies, errors, or outliers in the data that might adversely affect the model's performance.

• Handling Missing Values: Deal with missing values in the dataset, either by imputing them using suitable techniques or by removing the corresponding instances if the missing values are significant.

• Normalization: Normalize numerical data to a standardized range to ensure that variables with different scales do not bias the model's learning process. Common normalization techniques include min-max scaling and z-score normalization.

• Categorical Variable Encoding: Convert categorical variables into a suitable format for the model. This may involve one-hot encoding, where each category is represented by a binary vector, or ordinal encoding, where categories are assigned numerical values based on their order or significance.
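
The normalization and encoding steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the function names and the sample values are made up, and in practice scikit-learn or Pandas utilities would typically be used instead.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Centre values on zero mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def one_hot(labels):
    """Map each label to a binary vector with a single 1 at its class index."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    vectors = [[1 if index[label] == i else 0 for i in range(len(classes))]
               for label in labels]
    return classes, vectors

currents = [10.0, 20.0, 40.0]            # made-up phase-current samples
fault_types = ["AG", "NoFault", "AG"]    # made-up class labels

scaled = min_max_scale(currents)
standardized = z_score(currents)
classes, encoded = one_hot(fault_types)
print(scaled, classes, encoded)
```

Min-max scaling bounds every feature to the same interval, while z-score normalization is less sensitive to a single extreme sample setting the range; which one suits fault data better is an empirical question.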

3.3 Feature Engineering

The process aims to pick, transform and manipulate the acquired data into usable features that allow deep learning models to be applied. New features can be derived irrespective of the original data types and limitations. The overall process of feature engineering is shown in Figure 3.1.

[Flow diagram stages: Raw Data/Information; Remove Redundant Data; Scaling & Normalization; Filling Null Values]

Figure 3.1 Process of Feature Engineering
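
Two of the stages named in Figure 3.1, removing redundant data and filling null values, might look like the sketch below. It assumes that "redundant" means exact duplicate rows and that nulls are filled with the column mean; both are illustrative choices, and the helper names are hypothetical.

```python
def drop_duplicate_rows(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def fill_nulls_with_mean(rows):
    """Replace None entries with the mean of the non-missing values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        present = [row[j] for row in rows if row[j] is not None]
        means.append(sum(present) / len(present) if present else 0.0)
    return [[means[j] if row[j] is None else row[j] for j in range(n_cols)]
            for row in rows]

raw = [[1.0, 2.0], [1.0, 2.0], [3.0, None]]   # made-up samples with one duplicate and one null
cleaned = fill_nulls_with_mean(drop_duplicate_rows(raw))
print(cleaned)
```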

3.4 Feature Selection

Features are the input variables given to the deep learning algorithms. Feature selection reduces the input variables so that only relevant data moves forward to the layered network. Careful feature selection is imperative for training an optimal model and avoiding redundancies in the learning process. With too many features, models learn to capture irrelevant patterns and noise. The right feature selection therefore helps minimize noise and produces better, more reliable predictions.

Figure 3.2 Feature Selection

There are two types of Feature selection:

Supervised Models: Supervised feature selection uses the labelled output class to select
features, identifying the variables that are most relevant to the target in order to increase the
efficiency of the machine learning model. Supervised feature selection is of three types:

Filter Method: Features are selected or dropped based on their correlation with the output.
The filter method checks whether each feature correlates positively or negatively with the
output target and drops the redundant features accordingly.

Wrapper Method: The features are divided into subsets, and a model is trained on each subset.
After the performance of the model has been analysed, features are added to or removed from
the subset and the training cycle is repeated to find the combination of features that gives the
best accuracy.

Intrinsic Method: The intrinsic method combines the Filter and Wrapper methods to
generate the best subset of features.

Unsupervised Models: Unsupervised feature selection does not require the labelled output
class. It works on unlabelled data.
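
As an illustration of the filter method described above, the sketch below scores each feature by its absolute Pearson correlation with the target and keeps those above a cut-off. The synthetic data, the feature names, and the threshold of 0.3 are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
signal = rng.normal(size=n)                      # feature related to the target
noise = rng.normal(size=n)                       # feature unrelated to the target
target = 3.0 * signal + 0.1 * rng.normal(size=n)

features = {"signal": signal, "noise": noise}
threshold = 0.3                                  # hypothetical cut-off

# Score every feature by |correlation with the target|.
scores = {name: abs(np.corrcoef(col, target)[0, 1])
          for name, col in features.items()}

# Keep only the features whose score reaches the threshold.
selected = [name for name, score in scores.items() if score >= threshold]
```

Here only the feature that actually drives the target survives the filter; the unrelated one is dropped.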

3.5 Splitting the Data

After pre-processing the data, split it into a training set and a test set. The training set will be

used to train the regression model, while the test set will be used to evaluate its performance.

A common practice is to allocate around 80% of the data for training and reserve the

remaining 20% for testing. This split ensures that the model is trained on a sufficient amount

of data while still having unseen data for evaluation.

3.1.3. Model Building

Utilize TensorFlow's Keras API, which provides high-level abstractions for building neural

networks, to construct the regression model. The model's architecture needs to be defined,

including the number of layers, the number of nodes in each layer, and the activation function

used in each node. The choice of architecture depends on the complexity of the problem and

the available data. For a simple regression model, a single layer with one node and no

activation function may suffice. However, for more complex relationships and patterns,

multiple layers with varying numbers of nodes and appropriate activation functions, such as

ReLU or sigmoid, can be employed to capture nonlinearities in the data.

3.1.4. Model Training

Train the regression model using the training set. TensorFlow provides optimization

algorithms, such as stochastic gradient descent (SGD) or Adam, to update the model's
parameters iteratively and minimize the loss function. The loss function measures the

discrepancy between the predicted values and the actual values in the training data. During

training, the model learns to adjust its parameters to minimize this discrepancy and improve

its predictive accuracy.
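
The training loop described above can be sketched for the simplest case mentioned in Model Building, a single node with no activation function. The example uses plain full-batch gradient descent in NumPy rather than TensorFlow's SGD or Adam optimizers, and the synthetic data and learning rate are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 2.0 * x + 1.0              # synthetic targets drawn from a known line

w, b = 0.0, 0.0                # parameters of the single node
lr = 0.1                       # learning rate (illustrative value)

def mse(w, b):
    """Mean squared error between predictions w*x + b and targets y."""
    return float(np.mean((w * x + b - y) ** 2))

initial_loss = mse(w, b)
for _ in range(500):
    err = w * x + b - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * np.mean(err * x)
    grad_b = 2.0 * np.mean(err)
    # Move the parameters a small step against the gradient.
    w -= lr * grad_w
    b -= lr * grad_b
final_loss = mse(w, b)
```

Each iteration reduces the discrepancy between predictions and targets, and the parameters approach the slope and intercept that generated the data.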

3.1.5. Model Evaluation

Evaluate the performance of the trained regression model using the test set. Calculate

relevant metrics, such as mean squared error (MSE), root mean squared error (RMSE), mean

absolute error (MAE), or R-squared (R2), to assess the model's accuracy and predictive

capability. These metrics provide insights into how well the model generalizes to unseen data

and its ability to make accurate predictions. Additionally, visualizations such as scatter plots

or residual plots can help analyze the model's performance and identify any patterns or

discrepancies.
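
The regression metrics named above can be computed directly from their definitions. The four true and predicted values below are made up for the example.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 9.0])

err = y_pred - y_true
mse = np.mean(err ** 2)                     # mean squared error
rmse = np.sqrt(mse)                         # root mean squared error
mae = np.mean(np.abs(err))                  # mean absolute error

# R-squared: 1 minus residual sum of squares over total sum of squares.
ss_res = np.sum(err ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

For these values the MSE and MAE are both 0.5 and R-squared is 0.9.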

3.1.6. Model Refinement

Iterate on the model by fine-tuning its architecture, hyperparameters, and preprocessing steps

to improve its performance. This may involve adjusting the number of layers and nodes,

changing the activation functions, modifying the optimization algorithm, or incorporating

regularization techniques such as dropout.


Chapter 4: Implementation

This chapter discusses the implementation and results of the project based on the design
parameters, models, and algorithms discussed earlier in Chapter 3. The implementation process
lays out the steps of the project that are designed to achieve the desired goals. It also helps to
understand the project alternatives. The results are examined on the basis of the implementation
process to judge where the project succeeds and where it fails.

4.1 Data Source

There is a lack of open-source data regarding shunt faults. Most of the work done in this domain
relies on synthetic data generated through simulations that mimic real conditions (Khaoula
Assadi, 2023). The same approach is adopted here: a dataset is synthesized using simulations
based on a real-world power system.

4.2 Exploration and Implementation of Data Analytics Tools


4.2.1 Data Acquisition

To enable the evaluation of the deep learning approach, a simulated dataset is generated through a
Simulink simulation of a power system. A typical three-phase, 735 kV, 60 Hz power system with
six 350 MVA generators as its source delivers power from the generating station to a network
connected to a variable load through a 300 km transmission line, operating at a base voltage of
735 kV and a base power of 100 MW. The first bus, B1, is on the generation side and the second
bus, B2, is on the load side. CB1 and CB2 are the two line circuit breakers.

Figure 4.1 MATLAB Simulation of a real-time power distribution system, used to simulate shunt faults in the system

Voltages and Currents are measured on both buses B1 and B2. The Fault Breaker block is used

to execute the desired fault on the transmission line at the desired resistance and length. The

circuitry has the following key parameters:

Base voltage: 735 kV

Base power: 100 MW

Simulation time: 10 seconds

Fault occurrence: Simulated at 0.1667 seconds

Data recording: Three-phase voltages and currents are recorded per millisecond after 2 seconds,

representing the period of purely faulted conditions.


The data generation process is repeated 11 times, once for each short-circuit condition, and one
more time with no fault (perfect condition). The acquired data is stored in the MATLAB workspace
under variables named after the files. These variables are then exported as individual CSV files,
one per fault, because the CSV format is compatible with other machine learning frameworks.
Together, the files cover a range of fault scenarios on which a suitable model can be trained.

4.2.2 Data Cleansing and Modification

All 12 CSV files, one per fault condition, are imported as pandas DataFrames. The fault is
initiated at about 0.167 s of the 10 s simulation, so the first 4000 rows of each CSV file are
skipped to keep only faulted data. The combined row count of 2,352,012 reported in Figure 4.6
corresponds to 196,001 retained rows per file, that is, 200,001 recorded entries per file (one
additional entry accounting for T = 0 s) at a sampling interval of 0.00005 s. Skipping 4000 rows
therefore discards the first 0.2 s of each recording.

Figure 4.2 Importing data from the MATLAB exported csv files

A new column holding the fault type is added to each data frame, so that the data of the different
faults can still be told apart once all the data frames are concatenated into one.
Figure 4.3 Adding a new fault column to all the data frames

All data frames contain a timestamp (Time), the three-phase voltage reading (Va, Vb, Vc), the

three-phase current reading (Ia, Ib, Ic), and lastly the shunt fault type (Fault). The

data frames are ready to be concatenated into a single data frame.

Figure 4.4 Head of ABCG data frame

The signals have a frequency of 60 Hz, so one cycle lasts about 0.0167 s and every second of
recorded data contains 60 repetitions of a single wave cycle for each voltage and current. The
interval from 0.2 s to 0.3 s is visualized in the figures below.


Figure 4.5 All the three-phase voltages and currents are plotted for a second per each fault type. Each fault type has a very
distinct impact on the signal except for CG and ABCG which are quite similar.

All 12 data frames need to be combined into one before deep learning techniques can be applied
to the data. This is done by concatenating the 12 data frames row-wise, using the pd.concat
function of the pandas library with axis=0.

Because concatenation simply stacks the data frames one after another, the rows of each fault type
end up grouped together. To spread the data evenly and prevent bias in the random batches and
later in model training, the resulting data set is shuffled. Shuffling is done through the
DataFrame.sample method with frac=1 (frac is the fraction of rows to sample, between 0 and 1,
where 1 means all of the data), which returns all rows in random order and yields a shuffled data
frame carrying the data of all faults.
Figure 4.6 The data now has 9 columns x 2352012 rows, with the last column representing our target column which is Fault
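
The labelling, concatenation, and shuffling steps described above can be sketched with two tiny stand-in frames. The values, the reduced column set, and the ignore_index and random_state arguments are assumptions added for this illustration; the project's own code appears only in the figures.

```python
import pandas as pd

# Two tiny stand-in frames; the real ones come from the exported CSV files.
ag = pd.DataFrame({"Time": [0.2, 0.3], "Va": [0.1, 0.2]})
bg = pd.DataFrame({"Time": [0.2, 0.3], "Va": [0.5, 0.6]})

# Add the fault label to every frame before combining them.
ag["Fault"] = "AG"
bg["Fault"] = "BG"

# Row-wise concatenation (axis=0) stacks the frames one after another.
combined = pd.concat([ag, bg], axis=0, ignore_index=True)

# frac=1 samples all rows, which returns them in random order.
shuffled = combined.sample(frac=1, random_state=42).reset_index(drop=True)
```

The shuffled frame holds the same rows as the combined one, with the fault types interleaved instead of grouped.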

4.2.3 Feature Extraction

Feature extraction is a crucial step in machine learning (ML), and in deep learning in particular.
Raw data is transformed into a format well suited for model training. This step is particularly
important because the data typically encountered, such as images, audio, or text, is
high-dimensional and complex in nature.

The time and the three-phase voltages and currents, namely 'Time', 'Va', 'Vb', 'Vc', 'Ia', 'Ib',
and 'Ic', were extracted from the data frame as features for deep learning and saved in an array
named X. These features collectively provide comprehensive insights into the system's behavior
and are crucial for fault detection and classification. The target column 'Fault' was saved into
an array named y.


Figure 4.7 Feature extraction and labels through code

4.2.4 Data Normalization

Data normalization is a very important pre-processing step in deep learning. Features are
transformed to a common scale, which helps neural networks train and perform better and with
more stability (Singh, 2019).

Normalization was applied to the feature set to ensure that all features were on the same scale.
This improves the model's convergence during training and prevents certain features from
dominating because of their larger magnitudes. The StandardScaler class from the
sklearn.preprocessing module was used to normalize the features, resulting in a dataset in which
every feature has zero mean and unit variance.
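
The zero-mean, unit-variance transformation can be written out in NumPy to show what the scaling computes. This sketch uses made-up values and reproduces only the arithmetic (mean and population standard deviation per column), not the scikit-learn class itself.

```python
import numpy as np

# Hypothetical feature matrix: two columns with very different magnitudes.
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])

mean = X.mean(axis=0)
std = X.std(axis=0)            # population standard deviation (ddof=0)

# Subtract the column mean and divide by the column standard deviation.
X_normalized = (X - mean) / std
```

Both columns end up with mean 0 and standard deviation 1, regardless of their original units.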

4.2.5 Label Encoding and One-hot Encoding

Encoding is a fundamental tool in machine learning and deep learning, particularly when
categorical data is involved. Any discrete, non-numeric values are termed categorical, e.g.
product categories or types of transactions. A key component of deep learning is the neural
network, which operates only on numerical data. All categorical variables therefore have to be
transformed into a numeric format before a model can process them.

Label encoding is the process of transforming such categorical data into a numerical format. Each
unique category is assigned a unique whole number, creating a mapping from each category to its
numeric label (Sebastian Raschka, 2020). For example, the categorical array {cat, dog, bird}
might be assigned the labels {0, 1, 2} respectively.

The 'Fault' column, which holds the categorical fault labels, was encoded using label encoding.
This transformation converts the categorical labels into integer codes. The LabelEncoder class
from the sklearn.preprocessing module was used for this encoding.

With label encoding, the model may interpret the integer codes as having an order, even though
the fault categories have none. To resolve this, one-hot encoding is applied. One-hot encoding
assigns a unique index to each category, so that each category is represented by its own binary
vector. Each category thus has an isolated dimension, and rather than inferring ordinal relations
the neural network can learn to associate patterns with each category.

The label-encoded target column is one-hot encoded with the to_categorical function from the
Keras utilities, which turns the integer codes into binary vectors.
Figure 4.8 Transition of target values from categorical to ordinal into binary vectors

y          y_encoded   y_onehot
No_Fault   11          [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]
CG         10          [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.]
AB         0           [1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]
ABC        1           [0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]
AG         6           [0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.]
BCG        8           [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.]
ABCG       2           [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.]
ACG        5           [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.]
ABG        3           [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.]
BC         7           [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.]
AC         4           [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.]
BG         9           [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]

The table above shows the key to how each fault now has a corresponding binary vector. After

feature extraction, data normalization, and one hot encoding, the data is in a suitable form to be

fed to a deep learning model. There is only one step left.
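
The mapping in the table can be reproduced with a short NumPy sketch. It stands in for LabelEncoder and to_categorical rather than calling them: np.unique sorts the labels and returns their integer codes, and indexing an identity matrix with those codes gives the binary vectors.

```python
import numpy as np

# The twelve fault labels, in the order they appear in the table.
y = np.array(["No_Fault", "CG", "AB", "ABC", "AG", "BCG",
              "ABCG", "ACG", "ABG", "BC", "AC", "BG"])

# Label encoding: sorted unique classes and the integer code of each entry.
classes, y_encoded = np.unique(y, return_inverse=True)

# One-hot encoding: row k of the identity matrix is the vector for code k.
y_onehot = np.eye(len(classes))[y_encoded]

# Lookup from label to integer code, for inspection.
mapping = {str(name): int(code) for code, name in enumerate(classes)}
```

Because the classes are sorted as strings, AB receives code 0 and No_Fault receives code 11, matching the table.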

4.2.6 Train-Test Split

To train a neural network, it is exposed to a dataset of input-output pairs, from which the
network learns the underlying relationships and patterns. However, it is important to evaluate
the trained model on new, unseen data to see whether it generalizes well. The data must
therefore be subjected to a train-test split.

The data is split into two mutually exclusive subsets, the train set and the test set. The train
set is used only to train the model, while the model never sees the test set during training. The
test set is introduced only after training, to evaluate how the model performs on novel data.
Furthermore, the train and test sets should be representative of the overall data distribution, to
prevent skewed and biased performance. This is supported by the shuffling step performed
earlier and by stratifying the split.

For fine-tuning hyperparameters during training, an additional subset called a validation set is
sometimes needed. It helps in making important decisions about the model architecture and
serves as a practice set for the model before it is evaluated on the unseen test set.

To evaluate the model's performance, the dataset was split into training and testing subsets using

a widespread practice of an 80-20 split. The normalized feature matrix X_normalized and the

encoded target vector y_encoded were divided into X_train, X_test, y_train, and y_test using

the train_test_split function from sklearn.model_selection. This separation ensures that the

model's performance can be evaluated on unseen data.


Figure 4.9 train-test split code

The stratify parameter ensures that the split maintains the same proportion of classes in the two

subsets as present in the original data. This is highly crucial in cases like the shunt fault dataset

with a large number of classes. Fixing random_state to a number will ensure reproducible

results each time the split happens.
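
The effect of stratification can be illustrated with a small hand-written split. This is a sketch of the idea only, not scikit-learn's implementation: the function name stratified_split, the toy labels, and the seed are assumptions made for the example.

```python
import numpy as np

def stratified_split(y, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) with each class split in the same ratio."""
    rng = np.random.default_rng(seed)      # fixed seed gives a repeatable split
    train_idx, test_idx = [], []
    for label in np.unique(y):
        # Shuffle the positions of this class, then cut off the test share.
        idx = rng.permutation(np.flatnonzero(y == label))
        n_test = int(round(test_size * len(idx)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

# Toy labels with unequal class sizes: 50, 30, and 20 samples.
y = np.array([0] * 50 + [1] * 30 + [2] * 20)
train_idx, test_idx = stratified_split(y)
```

Each class contributes 20% of its samples to the test set, so the class proportions in both subsets match those of the full data.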

In summary, the pre-processing pipeline encompasses data cleaning, feature extraction,

normalization, and label encoding. These steps prepare the dataset for training and evaluation,

enabling the deep learning model to effectively learn and generalize from the data.

4.3 Deep Learning Algorithms

For accurate and robust results, it is important to select a model well suited to the
classification of the dataset. The process is resource-intensive and intrinsically iterative,
demanding well-rounded exploration of various models and hyperparameter tuning (R. Fan, 2019).
The data at hand, with its nuanced features, needs a model of sufficient complexity. Two deep
learning architectures have emerged as promising in this context: the Convolutional Neural
Network (CNN) and the Long Short-Term Memory (LSTM) network. They were chosen specifically for
their capacity to accommodate the challenges posed by our data.

By concentrating on these two architectures, the aim is to evaluate their respective performance
in depth and to discern which model is better at capturing the underlying behavior of the data.
The two models were selected on the criteria that they not only fit the data effectively but also
provide a solid base for reliable and meaningful results.


4.3.1 Convolution Neural Network

A Convolutional Neural Network is a type of artificial neural network designed to process and
analyze grid-like data such as images. It is a remarkably effective architecture for fault
detection and classification (Anshuman Bhuyan).

4.3.1.1 Data Reshaping

The data was reshaped into an image-like structure, because the Convolutional Neural Network
(CNN) expects its input in an image-like form.

Figure 4.1 2D dataset converted to 3D image-like shape

The network is designed to take input of shape (number of features, 1), and the data has 7
features. X_train_reshaped and X_test_reshaped are obtained by reshaping the 2D arrays into 3D
arrays of shape (number of samples, number of features, 1). This is a very common reshaping
format for such architectures.
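
The reshaping step can be sketched with a placeholder array. The sample count of 5 and the zero values are assumptions; only the shapes matter here.

```python
import numpy as np

n_samples, n_features = 5, 7     # 7 features: Time, Va, Vb, Vc, Ia, Ib, Ic
X_train = np.zeros((n_samples, n_features))

# Add a trailing channel axis: (samples, features) -> (samples, features, 1).
X_train_reshaped = X_train.reshape(n_samples, n_features, 1)
```

The values are unchanged; each sample simply becomes a 7-step sequence with a single channel.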

4.3.1.2 Architecture Design

A well-suited Convolution Neural network-based architecture is designed for the data.


Figure 4.2 Detailed CNN Model Architecture Properties

1D Convolutional Layer: This foundational layer has 32 filters. The input is scanned by a kernel
of size 3, which captures local patterns in the data efficiently. Nonlinearity is introduced into
the network through the ‘relu’ (rectified linear unit) activation function, which helps the
network model complex spatial relations.

Max Pooling Layer: The convolutional layer is followed by a max pooling layer. It downsamples
the data, reducing its dimensions while keeping the most prominent information. A pool size of 2
reduces the data without compromising the model’s ability to capture salient features.

Batch Normalization Layer: As the depth of the network increases, problems such as vanishing
gradients arise, which can lead to slow or stalled learning and difficulty in updating the
weights. Batch normalization normalizes the output of a layer and is included to mitigate such
depth-related problems.

Flatten Layer: The output of the earlier layers must be flattened (reshaped into a
one-dimensional vector) before it can be taken as input by the next layer.

Dense Layer: A dense layer of 128 units is employed to combine the extracted features into more
sophisticated patterns. Nonlinearity is once again introduced through the ‘relu’ activation
function.

Dropout Layer: Dropout is generally used to prevent overfitting by randomly deactivating a
proportion of the neurons during training. A dropout rate of 0.5 improves the network’s ability
to generalize and thus supports robust performance on unseen data.

Output Layer: The ‘softmax’ activation function is used in the final dense layer. It converts
the output vector into target class probabilities, creating a distribution over all fault classes
for multiclass classification.
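
The way the data shrinks through these layers can be traced with simple arithmetic. The sketch below assumes no padding, a stride of 1 for the convolution, non-overlapping pooling, and 12 output classes as in the encoding table; these settings are assumptions, since the exact configuration is given only in the architecture figure.

```python
def conv1d_len(n, kernel, stride=1):
    """Output length of a 1D convolution without padding."""
    return (n - kernel) // stride + 1

def pool1d_len(n, pool):
    """Output length of non-overlapping max pooling without padding."""
    return n // pool

steps, channels = 7, 1                    # input shape (7, 1)
steps = conv1d_len(steps, kernel=3)       # Conv1D with 32 filters -> (5, 32)
conv_params = 3 * channels * 32 + 32      # kernel weights plus biases
channels = 32
steps = pool1d_len(steps, pool=2)         # MaxPooling1D -> (2, 32)
flat = steps * channels                   # Flatten -> 64 values
dense_params = flat * 128 + 128           # Dense layer with 128 units
output_params = 128 * 12 + 12             # softmax layer over 12 classes
```

Under these assumptions the flattened vector has 64 values, and most of the trainable parameters sit in the 128-unit dense layer.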

The Convolutional Neural Network (CNN) model is then compiled with the ‘categorical
cross-entropy’ loss, which measures the difference between the predicted class probabilities and
the true class probabilities. The optimizer of choice is ‘Adam’ (Adaptive Moment Estimation). It
adapts the learning rates of individual model parameters based on running averages of their
gradients and squared gradients, helping to balance the speed of convergence and the stability of
the optimization process.

# Compile the model

cnn_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
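
The loss named in the compile call can be worked through by hand for a single sample. This NumPy sketch uses three classes and made-up logits; it shows the softmax and cross-entropy arithmetic only and is not the Keras implementation.

```python
import numpy as np

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())          # subtract the max for numerical stability
    return e / e.sum()

def categorical_crossentropy(y_true, y_prob):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return float(-np.sum(y_true * np.log(y_prob)))

logits = np.array([2.0, 1.0, 0.1])   # hypothetical network outputs
y_true = np.array([1.0, 0.0, 0.0])   # one-hot target: class 0 is correct

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)

# The same target scored against a prediction that favours the wrong class.
worse_loss = categorical_crossentropy(y_true, softmax(logits[::-1]))
```

With a one-hot target the loss reduces to the negative log of the probability assigned to the correct class, so a prediction that favours the wrong class is penalized more heavily.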


Two callback functions, EarlyStopping and ReduceLROnPlateau, are also used to improve model
training and performance.

Early Stopping: The validation loss is monitored and training is halted if the metric shows no
improvement for 5 consecutive epochs (patience=5). This prevents overfitting and conserves
computational power. The best weights found during training are restored.

ReduceLROnPlateau: The validation loss is monitored, and whenever it plateaus for 3 epochs
(patience=3) the learning rate is halved (factor=0.5), down to a minimum of 1e-6. This helps
fine-tune the model to fit the data better.

# Define callbacks
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, verbose=1, min_lr=1e-6)

# Train the model
cnn_model.fit(X_train_reshaped, y_train_encoded, batch_size=64, epochs=50, validation_split=0.2,
              callbacks=[early_stopping, reduce_lr])
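
How the two callbacks interact can be illustrated by replaying a list of validation losses through simplified versions of their rules. The function simulate_callbacks, the starting learning rate of 1e-3, and the loss values are assumptions for this sketch; the real Keras callbacks have further options (such as min_delta and cooldown) that are left out here.

```python
def simulate_callbacks(val_losses, lr=1e-3, stop_patience=5,
                       lr_patience=3, factor=0.5, min_lr=1e-6):
    """Replay validation losses through simplified callback logic."""
    best = float("inf")
    best_epoch = None
    stop_wait = 0        # epochs without improvement (early stopping)
    lr_wait = 0          # epochs without improvement (learning-rate reduction)
    lr_history = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
            stop_wait = 0
            lr_wait = 0
        else:
            stop_wait += 1
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr = max(lr * factor, min_lr)   # reduce, but not below min_lr
                lr_wait = 0
        lr_history.append(lr)
        if stop_wait >= stop_patience:
            break                               # halt training early
    return best_epoch, epoch, lr_history

# Hypothetical validation losses: improvement stops after epoch 2.
losses = [0.30, 0.25, 0.24, 0.26, 0.25, 0.27, 0.26, 0.28, 0.20]
best_epoch, stopped_epoch, lr_history = simulate_callbacks(losses)
```

In this run the learning rate is halved after three epochs without improvement, and training stops after five, so the final value in the list is never reached and the weights from epoch 2 would be the ones restored.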

The model is trained in batches of 64 for up to 50 epochs. Together, the selected techniques,
parameters, and layers contribute to an efficient and robust Convolutional Neural Network (CNN)
model for fault detection and classification on grid-like data.

4.3.2 Long Short-Term Memory (LSTM)

LSTM is a type of Recurrent Neural Network (RNN). It is designed to cope with the vanishing
gradient problem and is therefore remarkably effective on time series data. Special memory cells
store information for extended periods to prevent loss of context. Three gates, the input,
forget, and output gates, govern the memory cells. The input gate decides which new information
to incorporate, the forget gate decides which information should be discarded, and the output
gate decides which part of the cell's contents is passed on as output. The TL fault problem
involves time series data, so Long Short-Term Memory (LSTM) seems to be an excellent choice
(Fezan Rafique, 2021). The inherent temporal tendencies of the data are at the center of the
architecture created. The network is designed as follows:

Figure 4.3 Detailed LSTM Model Architecture Properties

Long Short-Term Memory (LSTM) Layer: An LSTM layer of 64 units uses the ‘tanh’ activation
function. Tanh introduces nonlinearity into the gated memory cell mechanism of the LSTM, enabling
the network to capture complex relations in sequence data. Returning sequences passes the full
output sequence on to the next layer for sequence-based analysis.

Dropout Layer: To prevent overfitting, a dropout rate of 0.2 is set in this layer. This is a
regularization mechanism.

Long Short-Term Memory (LSTM) Layer: A second LSTM layer of 32 units, also with the ‘tanh’
activation function. This layer tracks temporal patterns in the provided data.

Dropout Layer: A second dropout layer acts as another regularization mechanism, preventing
overfitting further.

Dense Layer: The final layer uses the ‘softmax’ activation. Prediction vectors are converted
into target class probabilities over all classes of targets.
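
The gate mechanism described at the start of this section can be sketched as a single LSTM time step in NumPy. This follows the standard LSTM equations with random, untrained weights and toy sizes (4 units, 1 input channel); it is an illustration of the mechanism, not the Keras layer used in the project.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b stack the i, f, g, o blocks row-wise."""
    units = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])      # input gate: what to add
    f = sigmoid(z[1 * units:2 * units])      # forget gate: what to discard
    g = np.tanh(z[2 * units:3 * units])      # candidate values (tanh)
    o = sigmoid(z[3 * units:4 * units])      # output gate: what to expose
    c = f * c_prev + i * g                   # updated cell state
    h = o * np.tanh(c)                       # hidden state passed onward
    return h, c

rng = np.random.default_rng(0)
n_in, units = 1, 4                           # toy sizes, not the 64/32 units above
W = rng.normal(size=(4 * units, n_in))
U = rng.normal(size=(4 * units, units))
b = np.zeros(4 * units)

h = np.zeros(units)
c = np.zeros(units)
for x_t in np.linspace(-1.0, 1.0, 7):        # a 7-step toy sequence
    h, c = lstm_step(np.array([x_t]), h, c, W, U, b)
```

The cell state c carries information across the steps of the sequence, while the gates control what is added to it, removed from it, and exposed through h.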

The Long Short-Term Memory (LSTM) model is then compiled with the ‘categorical cross-entropy’
loss, which measures the difference between the predicted class probabilities and the true class
probabilities. The optimizer of choice is ‘Adam’ (Adaptive Moment Estimation). It adapts the
learning rates of individual model parameters based on running averages of their gradients and
squared gradients, helping to balance the speed of convergence and the stability of the
optimization process.

# Compile the model

lstm_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Two callback functions, EarlyStopping and ReduceLROnPlateau, are also used to improve model
training and performance.

Early Stopping: The validation loss is monitored and training is halted if the metric shows no
improvement for 5 consecutive epochs (patience=5). This prevents overfitting and conserves
computational power. The best weights found during training are restored.

ReduceLROnPlateau: The validation loss is monitored, and whenever it plateaus for 3 epochs
(patience=3) the learning rate is halved (factor=0.5), down to a minimum of 1e-6. This helps
fine-tune the model to fit the data better.

# Define callbacks
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=1e-6)

# Train the model
lstm_model.fit(X_train_reshaped, y_train_encoded, batch_size=32, epochs=30,
               validation_split=0.2, callbacks=[early_stopping, reduce_lr])

Chapter 5: EXPERIMENTAL RESULTS AND DISCUSSION

In deep learning, achieving accurate and reliable results across a diverse dataset requires
selecting and fine-tuning an appropriate model. In this project, two powerful neural networks
were selected and rigorously examined, namely the Convolutional Neural Network (CNN) and the
Long Short-Term Memory (LSTM) network. The primary objective is a careful evaluation of their
performance on the prepared dataset. Through a comparative analysis, we shed light on the
distinct strengths and limitations of the respective models. Testing on the held-out test set
decides which architecture performs better in this specific context.

Furthermore, valuable insights are provided for further research in the field. Through this
endeavor, we aspire to find the most suitable architecture for shunt fault detection and
classification in transmission lines.

5.1 Evaluation of Test Data

The Convolutional Neural Network (CNN) and the Long Short-Term Memory (LSTM) network were
evaluated on the test dataset, and performance metrics such as loss and accuracy were recorded.

Figure 5.1 CNN and LSTM Test Data Evaluation

Figure 5.1 shows the output obtained for the two trained models.

5.1.1 Convolutional Neural Network (CNN) Model

The Convolutional Neural Network (CNN) was trained on 14,701 samples from the dataset. The
training process took approximately 61 seconds per epoch. The final loss on the test set, which
measures the average error between the predicted and actual values, was 0.1441; a lower loss
signifies better performance. An accuracy of 90.96% was achieved on the test set, which is the
percentage of correctly classified samples.

5.1.2 Long Short-Term Memory (LSTM) Model

The Long Short-Term Memory (LSTM) model was also trained on 14,701 samples from the dataset.
The training process took approximately 283 seconds per epoch. The final loss on the test set,
which measures the average error between the predicted and actual values, was 0.1156; a lower
loss signifies better performance. An accuracy of 91.67% was achieved on the test set, which is
the percentage of correctly classified samples.

However, the training process for the Long Short-Term Memory (LSTM) model was more
computationally intensive, taking approximately 283 seconds per epoch. This suggests that LSTM
models might require more time to process sequential data due to their complex architecture.


5.1.3 LSTM Vs CNN

In summary, the Long Short-Term Memory (LSTM) model outperformed the Convolutional Neural
Network (CNN) model in terms of both loss and accuracy on the test dataset. This could imply
that the dataset contains temporal dependencies that the LSTM network captures effectively.

The LSTM might seem to be the superior model here, but its computational cost is its strongest
limitation: 61 seconds versus 283 seconds per epoch is a major difference, since the LSTM takes
more than four times as long.

However, these results alone do not give insight into the broader context of the task or the
specific dataset used, and further analysis is required to draw more robust conclusions.

5.2 Training and Validation Curves

To examine the training and validation processes in detail their progress should be visually

represented. This could be achieved through loss and accuracy being plotted over epochs.

5.2.1 CNN Training and Validation Loss

The training loss starts at 0.3568 and declines to 0.25 within the first 10 epochs; this steep
decline shows that the model learns quickly from the training data. From epoch 10 to epoch 20
the loss drops by only about 0.05 as learning stabilizes, and this gentler decline continues
over the following 30 epochs until the training loss settles at 0.1617. Sudden drops appear
whenever the learning rate is reduced by the callback, specifically at epochs 23, 39, and 47.

The validation loss starts lower than the training loss, at 0.2387, a level the training loss
only reaches after about 12 epochs. It declines steadily over the epochs, with sudden drops at
epochs 23, 39, and 47 caused by the ReduceLROnPlateau callback. At 50 epochs the validation
loss is 0.1440.

Figure 5.2 CNN training and Validation loss curves

5.2.2 Long Short-Term Memory (LSTM) Training and Validation Loss

The training loss starts at 0.1409 and declines to 0.1164 within the first 5 epochs; this steep
decline shows that the model learns quickly from the training data. From epoch 5 to epoch 10
there is only a drop of 0.0004 as learning stabilizes, and this minuscule decline continues over
the next 20 epochs until the loss settles at 0.1156.

The validation loss starts lower than the training loss, at 0.1162, a level the training loss
only reaches after about 5 epochs. Interestingly, the validation loss first increases and then
decreases, and the two loss curves converge at 8 epochs, indicating that the model fits both the
training and validation sets well.


Figure 5.3 LSTM Training and Validation Loss Curves

5.2.3 Convolutional Neural Network (CNN) Training and Validation Accuracy

The training accuracy starts at 0.8459 and increases steadily over the 50 epochs to 0.9046.
Although this is an increase of only about 0.06 over 50 epochs, it is still considered a good
accuracy curve. The validation accuracy starts at 0.8810 and steadily increases to 0.9095.
Although both curves have the same shape, they do not converge at any point. The persistent gap
shows that the validation and training accuracy do not match, hence the model is a little skewed.
Figure 5.4 CNN Training and Validation Accuracy

5.2.4 Long Short-Term Memory (LSTM) Training and Validation Accuracy

The Long Short-Term Memory (LSTM) training and validation accuracy curves do not show any
progress. They oscillate continuously and end up exactly where they began, at 0.916. The curves
meet at several points but do not converge at any point.


Figure 5.5 LSTM Training and Validation Accuracy

Upon close observation of all the curves, the Long Short-Term Memory (LSTM) training and
validation curves seem to lie closer together, and hence the LSTM is a slightly better model.
No conclusion can be drawn from the curves alone; further exploration is still required to draw
valid conclusions.

5.3 Classification Report

A classification report is a summary of model performance on a dataset. It contains a set of metrics that measure how well the data is classified into the different classes, and it is especially useful for categorical (multiclass) classification.

Typically, the following metrics are added to the report:

Precision: Measures how many of the model's positive predictions are correct by taking the ratio of true positives over the sum of true positives and false positives.

Figure 5.6 Precision calculation equation

Recall: Measures the ability of the model to catch all the positive instances by taking a ratio of true

positives over the sum of true positives and false negatives.

Figure 5.7 Recall calculation equation

F1-score: Measures both precision and recall by calculating their harmonic mean.

Figure 5.8 F1-score equation

Support: The number of samples of each class in the evaluated data.

Accuracy: Measures how accurately the model classifies by taking the ratio of the sum of true positives and true negatives over the total number of samples.

Figure 5.9 Accuracy equation

Macro Average: The unweighted mean of a metric (precision, recall, or F1-score) across all classes.

Weighted Average: The mean of a metric across all classes, weighted by the number of samples (support) of each class.
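
The metrics listed above can all be derived from a confusion matrix. The following is a minimal sketch, not the code used in this project: the function name and the 3-class example matrix are hypothetical, and for the multiclass case accuracy is taken as the sum of the diagonal (correct predictions) over the total number of samples.

```python
def classification_metrics(cm):
    """Per-class and overall metrics from a confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    per_class = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted as another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append(
            {"precision": precision, "recall": recall, "f1": f1, "support": tp + fn}
        )
    accuracy = sum(cm[k][k] for k in range(n)) / total
    macro_f1 = sum(c["f1"] for c in per_class) / n
    weighted_f1 = sum(c["f1"] * c["support"] for c in per_class) / total
    return per_class, accuracy, macro_f1, weighted_f1


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
example_cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
per_class, accuracy, macro_f1, weighted_f1 = classification_metrics(example_cm)
for k, c in enumerate(per_class):
    print(k, {name: round(value, 3) for name, value in c.items()})
print("accuracy:", round(accuracy, 3))
```

When every class has the same support, as in this example, the macro and weighted averages coincide; they differ only when the classes are imbalanced.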

Long Short-Term Memory (LSTM) generally has higher precision and recall, with most per-class values equal to 1. This indicates that LSTM classifies the faults more accurately. The higher F1 score of Long Short-Term Memory (LSTM) shows that a good balance between precision and recall is maintained.

Figure 5.10 Classification Report of CNN vs. LSTM


5.4 Confusion Matrix Of Class Prediction Vs. True Class

A confusion matrix is a table of actual class vs. predicted class that describes how the deep learning model performs on test data whose true labels are known. It is particularly important for multiclass classification, where model performance cannot be gauged from accuracy alone. It gives a detailed, visual view of the performance of the network.

In terms of class-specific performance, as discussed in the Introduction section, LLLG shunt faults (ABCG) are complex to classify, which is evident from their low precision and recall scores. An LG fault (CG) also shows low scores. To investigate this behavior further, a confusion matrix of predicted class vs. true class is created. All classes show only minuscule misclassification except for the two discussed above.

Figure 5.11 Confusion Matrix of CNN vs. LSTM

The conclusion is that while most faults have a high probability of being correctly classified,

ABCG and CG are often confused with each other.
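
To make the reading of the confusion matrix concrete, the sketch below builds one from true and predicted labels and reports the pair of classes most often mistaken for each other. The function names and the tiny label lists are hypothetical and chosen only for illustration; "AG" is an assumed label, and the counts are not results from this project.

```python
def confusion_matrix(y_true, y_pred, labels):
    """cm[i][j] counts samples of true class labels[i] predicted as labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    cm = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        cm[index[t]][index[p]] += 1
    return cm


def most_confused_pair(cm, labels):
    """Pair of distinct classes with the most mutual misclassifications."""
    best_pair, best_count = None, 0
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            count = cm[i][j] + cm[j][i]  # errors in both directions
            if count > best_count:
                best_pair, best_count = (labels[i], labels[j]), count
    return best_pair, best_count


# Made-up labels for illustration only.
labels = ["AG", "CG", "ABCG"]
y_true = ["AG", "AG", "CG", "CG", "CG", "ABCG", "ABCG", "ABCG"]
y_pred = ["AG", "AG", "CG", "ABCG", "ABCG", "ABCG", "CG", "ABCG"]

cm = confusion_matrix(y_true, y_pred, labels)
pair, count = most_confused_pair(cm, labels)
for label, row in zip(labels, cm):
    print(label, row)
print(pair, count)
```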


5.5 Receiver Operating Characteristic (ROC) Curve

The ROC curve plots the true positive rate vs. the false positive rate for various classification thresholds. The Area Under the Curve (AUC-ROC) is the most common metric for summarizing the classification performance of a model.

Figure 5.12 ROC of CNN vs. LSTM

It shows that the ABCG and CG fault types have a lower Area Under the Curve (AUC) of 0.95, whereas the ideal value is 1. Hence, their classification is not ideal.
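
As a rough illustration of how an ROC curve and its AUC are obtained for a single fault class, the sketch below treats that class as positive and all other classes as negative (a one-vs-rest setup, which is an assumption about how the curves in this report were produced). The function names and the scores are hypothetical.

```python
def roc_curve(y_true, scores):
    """ROC points for binary labels (1 = positive, 0 = negative).

    scores are the predicted probabilities of the positive class.
    Returns the false positive rates and true positive rates obtained
    by lowering the decision threshold one distinct score at a time.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Consume every sample tied at this score before adding a point.
        while i < len(order) and scores[order[i]] == threshold:
            if y_true[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


# Made-up example: 1 marks samples of the class of interest.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_curve(y_true, scores)
print(round(auc(fpr, tpr), 4))
```

An AUC of 1 means every positive sample is scored above every negative one; values below 1, such as the 0.95 reported for ABCG and CG, mean that some negatives are ranked above some positives.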

5.6 Discussion

The project explored two deep learning models, Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM), for shunt fault detection and classification in transmission lines. Through careful study of the experimental results, valuable insights into the strengths and limitations of the two models have been gained.

Long Short-Term Memory (LSTM) outperforms Convolutional Neural Network (CNN) on the test dataset in terms of accuracy and loss, which suggests that it captures the temporal patterns of the dataset more effectively. The limiting factor is that it takes 283 seconds per epoch, which makes it computationally intensive and practically inconvenient.

Furthermore, the analysis of the learning curves provided an in-depth understanding of how both models learn. In the case of Convolutional Neural Network (CNN), there is a steady decline in both losses, indicating a robust training process. In contrast, the Long Short-Term Memory (LSTM) training loss curve displayed a rapid initial decline, while the validation loss fluctuated, hinting at overfitting, before stabilizing.

The Classification Report, Confusion Matrix, and Receiver Operating Characteristic (ROC)

curve analysis showed an overall very high rate of true classification, with two exceptions. The

fault types ABCG and CG have comparatively lower Area Under the Curve (AUC) scores and an

overall higher rate of misclassification. This indicates suboptimal classification for these two

fault types. To conclude, the superior performance of Long Short-Term Memory (LSTM) in fault classification is offset by its high computational cost.


Chapter 6: CONCLUSION AND RECOMMENDATION

6.1. Conclusion

The aim of this project was to craft an effective and accurate transmission line shunt fault detection and classification system using a deep learning approach. The 11 shunt faults were the main focus, and an encompassing methodology was developed for them. The data was acquired and run through several processing steps to find well-suited models.

A fault dataset was generated through a real-time MATLAB Simulink simulation. The acquired dataset underwent rigorous pre-processing, including data cleaning, feature extraction, normalization, and label encoding (Venkatesh, 2018). This prepared the data for training and evaluation of the deep learning models.
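
Two of the pre-processing steps named above, normalization and label encoding, can be sketched as follows. This is an illustration under assumptions, not the project's pipeline: min-max scaling is assumed as the normalization method, integer codes are assumed for the labels, and the function names and sample values are hypothetical.

```python
def min_max_normalize(column):
    """Scale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def label_encode(labels):
    """Map each distinct class name to an integer code (alphabetical order)."""
    classes = sorted(set(labels))
    mapping = {name: code for code, name in enumerate(classes)}
    return [mapping[name] for name in labels], mapping


# Made-up measurements and labels for illustration only.
phase_current = [-50.0, 0.0, 150.0]
fault_labels = ["CG", "ABCG", "CG"]

print(min_max_normalize(phase_current))
encoded, mapping = label_encode(fault_labels)
print(encoded, mapping)
```

In practice the scaling parameters (minimum and maximum) would be computed on the training split only and then reused for the validation and test splits, so that no information leaks from the evaluation data.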

The experimental results showed that both the Convolutional Neural Network (CNN)

and Long Short-Term Memory (LSTM) models were capable of effectively detecting

and classifying transmission line faults. However, the Long Short-Term Memory

(LSTM) model shows slightly better performance compared to the Convolutional Neural

Network (CNN) model, achieving lower test loss and higher test accuracy. This

superiority of the Long Short-Term Memory (LSTM) model is attributed to its ability to

capture sequential dependencies present in the time series data.

Out of the 11 shunt faults, two, ABCG and CG, proved challenging to classify. These showed low precision and recall scores. Additionally, the lower AUC for these faults further confirms the imperfect classification.

6.2. Recommendations

Further research is highly recommended on the two more challenging faults. Several small changes to the process described above could substantially improve the results.

Exploring Real Data vs. Simulated Data: Integrating real data with the simulated data acquired in the project can enhance data quality further (Tayo, 2019). Obtaining more varied real data is difficult, however, since this area of study involves data that is not readily available to the general public.

Conduct Sensitivity Analysis: Sensitivity analysis enables better decisions in data modeling and analysis. Key predictors are identified, and their impact on the results can be explored further (Markus J. Ankenbrand, 2021).

Experimenting With Ensembles: Combining multiple satisfactorily performing models by taking the mean of their predictions can result in a more robust model that draws on the strengths of all the combined models (Lee, 2016).

Employing Hybrid Models: Deep learning models integrated with probabilistic approaches can yield better model certainty (Cach N. Dang, 2021).
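
The ensemble recommendation above, taking the mean of the predictions of several models, can be sketched as follows. The function names and the probability values are hypothetical; in this setting the inputs would be the per-class probabilities produced by the trained CNN and LSTM for the same samples.

```python
def average_ensemble(prob_sets):
    """Average class probabilities over models.

    prob_sets[m][s][c] is the probability that model m assigns to
    class c for sample s. All models must score the same samples.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    return [
        [
            sum(prob_sets[m][s][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        for s in range(n_samples)
    ]


def predict(probs):
    """Index of the most probable class for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Made-up probabilities for two samples and two classes.
cnn_probs = [[0.6, 0.4], [0.3, 0.7]]
lstm_probs = [[0.2, 0.8], [0.1, 0.9]]

averaged = average_ensemble([cnn_probs, lstm_probs])
print(predict(cnn_probs), predict(lstm_probs), predict(averaged))
```

In this toy case the two models disagree on the first sample, and the averaged probabilities side with the more confident model. Whether averaging actually helps on the ABCG and CG classes would have to be checked experimentally.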

The research so far contributes to the advancement of power system fault detection and classification. It highlights the potential of the Long Short-Term Memory (LSTM) model for dealing effectively with time series datasets such as transmission line (TL) fault data. While some significant challenges remain, the overall high performance indicates that decent progress has been made in fault detection in power systems.


Chapter 7: REFERENCES

Ali Raza, A. B. (2020). A Review of Fault Diagnosing Methods in Power Transmission Systems. MDPI.

Anshuman Bhuyan, B. K. (2022). Convolutional Neural Network Based Fault Detection for Transmission Line. International Conference on Intelligent Controller and Computing for Smart Power, (pp. 1-4).

Cach N. Dang, M. N.-G. (2021). Hybrid Deep Learning Models for Sentiment Analysis. Hindawi.

Feifei Xu, Y. L. (2023). An improved ELM-WOA–based fault diagnosis for electric power. Frontiers in Energy Research.

Fezan Rafique, L. F. (2021). End to end machine learning for fault detection and classification in power

transmission lines. Electric Power Systems Research.

Jalal Sahebkar Farkhani, M. Z. (2020). The Power System and Microgrid Protection—A Review.

Sustainable Technologies in Intensive Energy Industrial Consumers: The New Path to Carbon Neutrality.

Khaoula Assadi, J. B. (2023). Shunt faults detection and classification in electrical power transmission

line systems based on artificial neural networks. The international journal for computation and

mathematics in electrical and electronic engineering.

Lee, J. B. (2016, September 21). How To Improve Deep Learning Performance. Deep Learning Performance.

M. Singh, B. K. (2011). Transmission line fault detection and classification. International Conference on

Emerging Trends in Electrical and Computer Technology, (pp. 15-22).

Majid Jamil, S. K. (2011). Fault detection and classification in electrical power transmission system. SpringerPlus.

Markus J. Ankenbrand, L. S. (2021, February 15). Sensitivity analysis for interpretation of machine learning based segmentation models in cardiac MRI. BMC Medical Imaging, 21.

Prerana P. Wasnik, D. N. (2019). Fault detection and classification based on semisupervised.

International Conference on Innovative Trends and Advances in Engineering and Technology.

Qinghua Wang, Y. Y. (2020). Fault Detection and Classification in MMC-HVDC Systems Using Learning Methods. MDPI.

R. Fan, T. Y. (2019). Transmission Line Fault Location Using Deep Learning Techniques. North

American Power Symposium, (pp. 1-5).

Sebastian Raschka, J. P. (2020). Machine Learning in Python: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence. MDPI.

Singh, D. S. (2019). Data normalization is a very important pre-processing tool, especially with feature

extraction in deep learning. Features are scaled and transformed in to a uniform like distribution aiding

the neural networks to train and perform better with more stabili. Science Direct.

Tayo, B. O. (2019, December 13). Combining Actual Data with Simulated Data in Machine Learning. Towards Data Science.

Venkatesh, V. (2018). Fault Classification and Location Identification on Electrical. Virginia

Commonwealth University.

Zhang, C. L. (2016). Fault classification on transmission line of 10kV rural power grid. Proceedings of

the 2015 4th International Conference on Sensors, Measurement and Intelligent Materials.
