ICT Assignment

The document discusses various machine learning techniques for fault classification, signal processing, power demand prediction, and lifespan estimation of transformers. It outlines supervised and unsupervised learning methods, data collection strategies, feature extraction, and evaluation metrics for different applications in power systems. Additionally, it highlights challenges and strategies for effective data management and model implementation.

Uploaded by

shaheeralik2005

NAME: Muhammad Shaheer Ali Khan

CMS: 509801

BEE-16 D

SUBMITTED TO: Dr Syed Taha Ali

DATE: 21-12-2024

QUESTION 1
(1)

Supervised learning can classify faults in a power grid by training a model on labeled data, i.e.,
sensor readings recorded under normal and fault conditions. Once trained, the model can classify
new sensor readings as normal condition, short circuit, or line breakage.

(2)
Input:
1. Voltage
2. Current
3. Frequency
Output:
1. Normal
2. Line Breakage
3. Short Circuit

(3)

1. Simulation of Data:
Use physics-based grid simulations or generate synthetic sensor readings. For each feature
(voltage, current, frequency), produce values under both normal and fault conditions.
2. Labeling of Data:
Assign labels based on predefined thresholds (e.g., voltage < X and current > Y for a short circuit).
3. Challenge (Noisy Sensor Data):
Real sensors are inaccurate, so add realistic noise to the simulated readings using random
perturbations or Gaussian noise.
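
The three steps above can be sketched in a few lines of Python. This is only an illustration: the nominal values, the noise levels, and the thresholds standing in for X and Y are assumptions chosen to resemble the sample data set, not values taken from a real grid.

```python
import random

# Hypothetical nominal operating points for each condition (illustrative only).
NOMINAL = {
    "normal":   {"voltage": 230.0, "current": 5.0},
    "short":    {"voltage": 185.0, "current": 48.0},
    "breakage": {"voltage": 0.0,   "current": 0.0},
}
FREQUENCY_HZ = 50.0


def simulate_reading(condition, rng, noise_std=2.0):
    """Return one noisy (voltage, current, frequency) reading for a condition."""
    base = NOMINAL[condition]
    # Gaussian noise mimics sensor inaccuracy; readings are clipped at zero.
    voltage = max(0.0, base["voltage"] + rng.gauss(0.0, noise_std))
    current = max(0.0, base["current"] + rng.gauss(0.0, noise_std * 0.25))
    frequency = FREQUENCY_HZ + rng.gauss(0.0, 0.05)
    return voltage, current, frequency


def label_by_threshold(voltage, current):
    """Assign a label from assumed thresholds (the X and Y of the text)."""
    if voltage < 20.0 and current < 3.0:
        return "breakage"
    if voltage < 210.0 and current > 30.0:
        return "short"
    return "normal"


def make_dataset(n_per_class=100, seed=0):
    """Simulate n_per_class noisy readings per condition and label each one."""
    rng = random.Random(seed)
    rows = []
    for condition in NOMINAL:
        for _ in range(n_per_class):
            v, i, f = simulate_reading(condition, rng)
            rows.append((v, i, f, label_by_threshold(v, i)))
    return rows


dataset = make_dataset()
```

Labeling from the noisy readings rather than from the simulated condition mirrors step 2; with larger noise the two would start to disagree, which is exactly the challenge named in step 3.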

(4)
1. Collection of Data:
Gather real-time sensor data and label events based on operator logs or simulations.
2. Data Preprocessing:
Filter noise, normalize readings, and extract features like rate of change.
3. Dividing the Data into Training and Testing Set:
Assign 70%-80% of the dataset for training and 20%-30% for testing.
4. Model Training:
Select and train a Decision Tree or Neural Network model.
5. Prediction using the Trained Model:
Detect faults in real-time based on new sensor readings.
6. Model Evaluation:
Report precision and F1-score for each fault type.

Data Set:

Voltage (V)   Current (A)   Frequency (Hz)   Fault Type
230           5             50               Normal
180           50            50               Short
0             0             50               Breakage
230           6             50               Normal
190           45            50               Short
0             0             50               Breakage
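
As a rough illustration of training and prediction on the six rows above, the sketch below uses a nearest-centroid classifier. This is a deliberately simple stand-in for the Decision Tree or Neural Network named in step 4, not the method itself, and six rows are far too few for a real model. Features are left unscaled for brevity.

```python
import math

# The six rows of the data set: (voltage, current, frequency, label).
ROWS = [
    (230, 5, 50, "Normal"),
    (180, 50, 50, "Short"),
    (0, 0, 50, "Breakage"),
    (230, 6, 50, "Normal"),
    (190, 45, 50, "Short"),
    (0, 0, 50, "Breakage"),
]


def fit_centroids(rows):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for *features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


centroids = fit_centroids(ROWS)
prediction = predict(centroids, (185, 47, 50))  # a new, unseen sensor reading
```

The reading (185 V, 47 A, 50 Hz) lies closest to the average of the two short-circuit rows, so it is classified as "Short".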

QUESTION 2
(1)

Features that could be extracted from the signals include:

1. Time-Domain Features:
• Amplitude Variations: Signal strength over time (important for AM).
• Instantaneous Frequency: Rate of phase change (key for FM).
• Energy: Signal power in time windows.
2. Frequency-Domain Features:
• Spectral Density: Energy distribution across frequencies.
• Bandwidth: Width of the signal's frequency spectrum.
• Harmonics: Key for AM and FM signals.
3. Statistical Features:
• Mean & Variance: Amplitude and frequency properties.
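
A minimal sketch of how a few of these features might be computed for one signal segment, using only the Python standard library. The function names, the 1 kHz sample rate, and the 50 Hz test tone are assumptions made for illustration; a real pipeline would normally use an FFT library rather than the naive DFT shown here.

```python
import cmath
import math
import statistics


def time_domain_features(samples):
    """Mean, variance, energy and peak amplitude of one signal segment."""
    return {
        "mean": statistics.fmean(samples),
        "variance": statistics.pvariance(samples),
        "energy": sum(x * x for x in samples),
        "peak_amplitude": max(abs(x) for x in samples),
    }


def dominant_frequency(samples, sample_rate):
    """Frequency (Hz) of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n


# Hypothetical test tone: a 50 Hz sine sampled at 1 kHz for 0.2 s.
SAMPLE_RATE = 1000
signal = [math.sin(2 * math.pi * 50 * t / SAMPLE_RATE) for t in range(200)]

features = time_domain_features(signal)
peak_hz = dominant_frequency(signal, SAMPLE_RATE)
```

For this test tone the strongest spectral bin falls at 50 Hz, and the energy of the unit-amplitude sine over 200 samples is approximately 100.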

(2)

1. Data Collection

Use a signal generator to simulate signals in AM, FM, and PSK schemes under various conditions
(e.g., noise, fading). Collect real-world signals from communication channels if possible.

2. Data Preprocessing
• Filtering: Remove noise using low-pass or band-pass filters.
• Segmentation: Divide the signal into fixed-length segments.
• Normalization: Standardize feature values for consistency.
3. Feature Extraction

Compute features such as amplitude, phase, frequency, spectral density, etc.

4. Data Splitting

Split the dataset into a Training Set (70%) to train the model, a Validation Set (15%) for
hyperparameter tuning, and a Test Set (15%) to evaluate model performance.

5. Model Training

Train a supervised learning model such as a Decision Tree or an SVM.

6. Model Validation

Use k-fold cross-validation to check that the model generalizes.
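
The data splitting and validation steps can be sketched as follows. The helper names and the placeholder segment IDs are hypothetical; the 70/15/15 ratios come from step 4, and k = 5 is an assumed value.

```python
import random


def split_dataset(items, train=0.70, val=0.15, seed=0):
    """Shuffle and split into train / validation / test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def k_fold_indices(n_items, k=5):
    """Yield (train_indices, held_out_indices) for each of k folds."""
    indices = list(range(n_items))
    fold_size, remainder = divmod(n_items, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds take one extra item each.
        stop = start + fold_size + (1 if fold < remainder else 0)
        yield indices[:start] + indices[stop:], indices[start:stop]
        start = stop


segments = list(range(100))  # placeholder IDs for 100 signal segments
train_set, val_set, test_set = split_dataset(segments)
folds = list(k_fold_indices(len(train_set), k=5))
```

Cross-validation is applied to the training portion only, so the test set stays untouched until the final evaluation.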

(3)

Accuracy is a suitable evaluation metric for a balanced dataset, i.e., one in which all
modulation types are equally represented. The F1-score, which balances precision and recall,
is better suited to an imbalanced dataset. Here the F1-score is preferred because modulation
types may differ in prevalence (e.g., FM may dominate); computed per class, the F1-score
ensures performance is evaluated fairly across all classes.
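
The difference between the two metrics can be seen in a small worked example. The label counts below are hypothetical: FM makes up 90% of the test set, and a degenerate model predicts FM for every signal. Averaging the per-class F1-scores (macro averaging) is one common way to combine them and is an assumption here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, cls):
    """F1-score (harmonic mean of precision and recall) for one class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical imbalanced test set: FM dominates, and the model labels everything FM.
y_true = ["FM"] * 90 + ["AM"] * 5 + ["PSK"] * 5
y_pred = ["FM"] * 100

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Accuracy comes out at 0.90 even though the model never recognizes AM or PSK, while the macro F1-score is roughly 0.32 and exposes the failure on the minority classes.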

DATA SET:

Dataset Name: RadioML
Description: A dataset for modulation classification, including AM, FM, and PSK signals
Modulation Types: AM, FM, PSK, QAM, etc.
Signal Characteristics: Time-domain, frequency-domain, and statistical features
Data Size: 200,000 samples

Dataset Name: CICIDS
Description: Signal classification for wireless communication in security scenarios
Modulation Types: AM, PSK, FSK, etc.
Signal Characteristics: Time-domain and frequency features
Data Size: 1,000,000 packets

Dataset Name: GNU Radio Synthetic
Description: Generated synthetic signals with modulations like AM, FM, PSK, etc.
Modulation Types: AM, FM, PSK, etc.
Signal Characteristics: Time-domain, spectral density, harmonics
Data Size: Custom samples

QUESTION 3
(1)
Regression techniques can help predict power demand for the next 24 hours in three ways:
1. Time Series Forecasting: Regression models such as Linear Regression and Polynomial
Regression capture trends and patterns in historical data to predict future demand.
2. Trend Prediction: Regression techniques can use time-related features (e.g., hour of day)
to predict demand over the next 24 hours.
3. Multiple Linear Regression: Incorporating factors like temperature or economic activity can
improve accuracy by considering multiple predictors.
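
As a minimal sketch of the regression idea, the code below fits an ordinary least squares line with a single predictor (temperature). The observations are hypothetical and exactly linear so the result is easy to check by hand; a real forecast would use many more predictors and historical records.

```python
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_mean, y_mean = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


# Hypothetical hourly observations: temperature (deg C) and demand (MW).
temperature = [22.0, 25.0, 28.0, 31.0, 34.0, 37.0]
demand = [410.0, 440.0, 470.0, 500.0, 530.0, 560.0]

intercept, slope = fit_line(temperature, demand)
forecast = intercept + slope * 30.0  # predicted demand at a forecast 30 deg C
```

Here each additional degree adds 10 MW of demand, so a forecast temperature of 30 deg C gives a predicted demand of about 490 MW.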

(2)
Three features that might influence power consumption are:

1. Temperature (Weather): Higher temperatures can increase demand due to the use of air
conditioning in hot weather, while colder temperatures can increase heating demand.
Weather patterns can significantly impact power consumption.
2. Day of the Week/Time of Day: Power consumption typically varies by the time of day
(e.g., higher during working hours, lower at night) and by day of the week (e.g., weekdays
vs. weekends).
3. Holiday or Special Events: During holidays or events (e.g., festivals, public holidays), power
consumption patterns can shift due to increased or decreased activity. Holidays can result
in changes like more people staying home or businesses closing, affecting overall demand.

(3)
Temporal trends such as seasonal patterns and holidays can be included in the model as
additional features. Electricity demand often follows seasonal trends, with higher consumption
in summer (due to cooling needs) or winter (due to heating). These trends can be captured by
adding a seasonality feature or by using sinusoidal encoding to represent cyclical patterns.
Similarly, public holidays cause deviations from the typical consumption pattern. To capture
this, a binary holiday indicator (1 for holidays, 0 otherwise) or a public holiday schedule can
be added as a feature.
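
A short sketch of how such temporal features might be built from a timestamp. The feature names and the one-entry holiday calendar are assumptions for illustration; a real calendar would come from an official public holiday schedule.

```python
import math
from datetime import date, datetime

# Hypothetical holiday calendar (illustrative only).
HOLIDAYS = {date(2024, 12, 25)}


def temporal_features(ts):
    """Encode hour and day-of-year on the unit circle and flag holidays."""
    hour_angle = 2 * math.pi * ts.hour / 24
    day_angle = 2 * math.pi * (ts.timetuple().tm_yday - 1) / 365.25
    return {
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "season_sin": math.sin(day_angle),
        "season_cos": math.cos(day_angle),
        "is_weekend": int(ts.weekday() >= 5),
        "is_holiday": int(ts.date() in HOLIDAYS),
    }


features = temporal_features(datetime(2024, 12, 25, 18, 0))
```

Encoding the hour as a sine/cosine pair keeps 23:00 and 00:00 close together in feature space, which a plain 0-23 integer would not.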

DATA SET:

Dataset Name: UCI Machine Learning Repository - Individual Household Electric Power Consumption Dataset
Description: Contains hourly electricity consumption data for households, including features like date, time, and power usage.
Features: Date, time, global active power, voltage, sub-metering, power usage

Dataset Name: Electricity Consumption Forecasting Dataset
Description: Dataset for forecasting electricity consumption based on historical usage, potentially including weather data and time-based features.
Features: Hourly power usage, temperature, humidity, time of day, day of week, holiday indicators.

Dataset Name: ENST Dataset
Description: Includes power demand data along with environmental data like temperature, which can be used to predict electricity demand.
Features: Power consumption, temperature, humidity, time-related features (hour, weekday).

QUESTION 4
(1)

Clustering is one type of unsupervised learning. Clustering methods such as K-Means analyze
user data without predefined categories and automatically group users by similarities in
their consumption patterns. Clustering can reveal distinct patterns among industrial and
residential users:

• Industrial Users: Characterized by high and consistent electricity usage, often with
predictable peak times.
• Residential Users: Typically show more variability, with patterns influenced by time of
day, season, or household size.

(2)

Potential features for clustering include:

1. Average Daily/Hourly Consumption: captures overall consumption patterns.
2. Peak Usage Hours: identifies when users consume the most electricity.
3. Seasonal Variability: measures how consumption changes with the seasons.
4. Total Monthly Consumption: distinguishes high-use industrial users from low-use residential
ones.
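
To show how K-Means would group users on features like these, here is a compact sketch of the algorithm (Lloyd's iteration) in plain Python. The eight users and their values are hypothetical, and only two of the features are used. In practice the features should be normalized first, since raw consumption otherwise dominates the distance.

```python
import math
import random


def k_means(points, k, iterations=50, seed=0):
    """Plain k-means (Lloyd's algorithm): returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen users
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


# Hypothetical users: (average daily consumption in kWh, peak usage hour).
users = [
    (12.0, 19), (9.5, 20), (14.0, 18), (11.0, 21),        # residential-like
    (950.0, 11), (1020.0, 10), (880.0, 12), (990.0, 11),  # industrial-like
]
centroids, labels = k_means(users, k=2)
```

The first four users end up in one cluster and the last four in the other, without any labels having been supplied.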

(3)

One evaluation method for assessing cluster quality is the Silhouette Score. It measures
how similar each user is to the other users in its own cluster compared with users in the
nearest other cluster. The score ranges from -1 to 1, and a higher score indicates compact,
well-separated clusters.
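
The Silhouette Score can be computed directly from its definition, as in the sketch below. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the point's score is (b - a) / max(a, b). The four users and both labelings are hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (needs at least 2 clusters)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical users described by average daily consumption (kWh) only.
users = [(10.0,), (12.0,), (900.0,), (1000.0,)]
good = silhouette_score(users, [0, 0, 1, 1])  # low users vs. high users
bad = silhouette_score(users, [0, 1, 0, 1])   # each cluster mixes both kinds
```

The sensible grouping scores close to 1, while the mixed grouping scores below 0, which signals that users sit nearer to the other cluster than to their own.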

(4)

These patterns could help optimize energy distribution and tariff plans by grouping users
into heavy, moderate, and light consumers:

• Heavy Users: Offer discounts for high volumes or incentives for peak-time reductions.
• Moderate Users: Introduce time-of-use tariffs to encourage off-peak usage.
• Light Users: Provide flat-rate or low-cost tariffs to retain users with low consumption.

DATA SET:

Dataset Name: Smart Meter Energy Consumption Data
Description: Hourly electricity usage data collected from residential and commercial users.
Features: Date, time, hourly usage, user type, peak hours, seasonal variation.

Dataset Name: Electricity Consumption Data
Description: Aggregated consumption data for industrial and residential users with temporal features.
Features: Hourly/monthly consumption, temperature, weekday/weekend indicators, user type, billing data.

Dataset Name: Irish Smart Meter Electricity Data
Description: Dataset from residential and small business users in Ireland with detailed consumption trends.
Features: Date, time, user type, energy usage, peak/off-peak usage, and cost data.

QUESTION 5
(1)

Both supervised and unsupervised learning can be used for this type of problem.

Supervised Learning:

Supervised models predict energy availability from renewable sources based on historical data
and weather conditions. For example, regression models (e.g., Linear Regression, Neural
Networks) forecast solar and wind power output, while classification models (e.g., Decision
Trees) predict grid statuses such as "normal" or "overloaded."

Unsupervised Learning:

Unsupervised models are used for clustering consumption patterns or detecting anomalies in
the grid. For example, clustering algorithms (e.g., K-Means) segment regions or customers
based on energy usage profiles.
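
Anomaly detection needs no labels, and one very simple way to sketch it is a z-score rule: flag any reading that lies far from the mean in units of standard deviation. This is a basic statistical stand-in chosen for brevity, not a specific algorithm named above; the load values and the 2.5 threshold are assumptions.

```python
import statistics


def find_anomalies(readings, threshold=2.5):
    """Return indices of readings more than `threshold` std devs from the mean."""
    mean = statistics.fmean(readings)
    std = statistics.pstdev(readings)
    if std == 0:
        return []  # all readings identical, nothing stands out
    return [
        i for i, value in enumerate(readings) if abs(value - mean) / std > threshold
    ]


# Hypothetical hourly grid load in MW with one injected spike at index 6.
load_mw = [502, 498, 505, 499, 501, 497, 640, 503, 500, 496, 504, 499]
anomalies = find_anomalies(load_mw)
```

Only the 640 MW spike is flagged. On short series a single large outlier inflates the standard deviation, so the threshold has to be chosen with the window length in mind.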

(2)

The expected features and their relevance can be:

1. Feature: Weather Data

Relevance: Solar irradiance and wind speed are critical for renewable energy prediction.

2. Feature: Historical Generation

Relevance: Helps capture trends in solar and wind power output for better forecasting

3. Feature: Time Features

Relevance: Seasonal, daily, and hourly patterns influence both energy demand and renewable
supply

4. Feature: Energy Demand Data

Relevance: Necessary to predict peak loads and manage supply-demand balance

(3)

Consider a scenario in which the predictions or discovered patterns help in real time.

Scenario:
A solar farm experiences reduced output due to sudden cloud cover, while demand increases
during evening peak hours.

Action:

The supervised learning model predicts the shortfall in solar output. The system automatically
switches to wind power or discharges energy from batteries. Anomaly detection (unsupervised)
monitors for unusual grid conditions during the transition.

Output:

Prevents outages by maintaining a stable energy supply. Optimizes cost by using renewable
sources and reducing reliance on costly backup generators.
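
The switching logic in this scenario might look like the sketch below. The function, its parameters, and the priority order (wind before batteries) are assumptions made for illustration; the solar forecast would come from the supervised model, and all figures are hypothetical.

```python
def dispatch(demand_mw, solar_forecast_mw, wind_available_mw, battery_available_mw):
    """Cover a predicted solar shortfall with wind first, then batteries."""
    shortfall = max(0.0, demand_mw - solar_forecast_mw)
    from_wind = min(shortfall, wind_available_mw)
    from_battery = min(shortfall - from_wind, battery_available_mw)
    return {
        "solar": min(demand_mw, solar_forecast_mw),
        "wind": from_wind,
        "battery": from_battery,
        # Any remainder would have to come from backup generation.
        "unserved": shortfall - from_wind - from_battery,
    }


# Hypothetical evening peak: cloud cover cuts the solar forecast to 20 MW.
plan = dispatch(
    demand_mw=100.0,
    solar_forecast_mw=20.0,
    wind_available_mw=50.0,
    battery_available_mw=40.0,
)
```

With these numbers the 80 MW shortfall is met by 50 MW of wind and 30 MW from batteries, leaving nothing for backup generators to cover.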

(4)
Expected Challenges :

1. Solar and wind outputs are highly weather-dependent, making accurate predictions
challenging.
2. Models rely on extensive historical and real-time data, which may be incomplete or
noisy.
3. Grid operations require immediate decisions, demanding fast and scalable machine
learning algorithms.
4. Integrating machine learning systems and IoT devices (e.g., smart meters) incurs
significant upfront investment.

DATA SET:
Dataset Name: Open Power System Data
Description: Dataset for European grid operations, including renewable energy generation and demand.
Features: Weather data, energy production, load data, and storage levels.

Dataset Name: Renewable Energy Forecasting
Description: Renewable energy datasets for wind and solar generation forecasting.
Features: Solar irradiance, wind speed, temperature, humidity, and energy production.

Dataset Name: UCI Smart Grid Stability Dataset
Description: Dataset for predicting grid stability using renewable energy and consumption data.
Features: Historical generation, load, weather data, and grid stability status.

QUESTION 6
(1)
Regression models predict the remaining lifespan of transformers by learning the relationship
between historical operating data and failure times. The approach is supervised learning,
with historical transformer data as input features and recorded lifespans as output labels.
Algorithms such as Linear Regression, Polynomial Regression, or more advanced models such as
Random Forest Regressors or Neural Networks can model these relationships.

(2)
Five factors (features) that could influence component lifespan, with their importance:

1. Factor: Operating Temperature

Importance: Excessive heat accelerates insulation breakdown, reducing transformer lifespan

2. Factor: Load Cycles

Importance: Frequent load changes cause mechanical stress, leading to wear and tear over time

3. Factor: Fault History

Importance: Past faults (e.g., short circuits) weaken components and shorten their operational
life

4. Factor: Humidity and Moisture Level

Importance: High moisture levels degrade insulation and increase risk of electrical failure

5. Factor: Oil Quality

Importance: Degraded oil loses its cooling and insulation properties, directly impacting
transformer durability

(3)
The possible consequences of inaccurate predictions are:

1. Underestimating Lifespan:

Prematurely replacing transformers leads to unnecessary costs and wasted resources.

2. Overestimating Lifespan:

Delayed maintenance could result in unexpected failures, causing substation outages and
power supply interruptions, damage to other grid components through cascading failures, and
increased downtime and repair costs.

3. Impact on Reliability:

Unreliable lifespan predictions undermine the confidence of operators in the predictive
maintenance system, potentially leading to inefficiencies.

(4)

Strategy for Data Collection :

1. Install Sensors:

Equip transformers with IoT sensors to continuously monitor key parameters such as
temperature, load, and vibration.

2. Fault Recording:

Maintain detailed logs of all faults, including type, severity, and resolution times.

3. Regular Inspections:

Periodically assess transformers for physical wear, oil quality, and insulation degradation.

4. Historical Data:

Gather and digitize maintenance and operational records to establish trends over time.

5. Environmental Monitoring:

Collect data on ambient conditions (e.g., temperature, humidity) that could impact
performance.

6. Data Validation:

Use automated checks and manual reviews to ensure the data is accurate and free of
inconsistencies or missing values.
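
The automated checks in step 6 could start as simply as the sketch below, which flags missing values and readings outside a plausible range. The field names and the ranges are hypothetical and would need to be set from the actual sensor specifications.

```python
# Hypothetical plausible ranges for each monitored parameter.
VALID_RANGES = {
    "temperature_c": (-40.0, 200.0),
    "load_percent": (0.0, 150.0),
    "humidity_percent": (0.0, 100.0),
}


def validate_record(record):
    """Return a list of problems found in one sensor record (empty if clean)."""
    problems = []
    for field, (low, high) in VALID_RANGES.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not low <= value <= high:
            problems.append(f"{field}: {value} outside [{low}, {high}]")
    return problems


records = [
    {"temperature_c": 65.0, "load_percent": 80.0, "humidity_percent": 40.0},
    {"temperature_c": 950.0, "load_percent": 80.0, "humidity_percent": None},
]
report = {index: validate_record(r) for index, r in enumerate(records)}
```

The first record passes, while the second is reported with an out-of-range temperature and a missing humidity value, which can then be sent for manual review.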

DATA SET:
Dataset Name: IEEE Transformer Dataset
Description: A dataset of transformer fault and operational data, including temperature, load, and fault history.
Features: Operating temperature, load cycles, fault history, vibration, environmental data (humidity, dust, etc.).

Dataset Name: UCI Predictive Maintenance Dataset
Description: Data used for predictive maintenance of electrical components, including transformer components.
Features: Fault history, operational stress, temperature data, vibration levels, maintenance logs.

Dataset Name: Power Transformer Life Expectancy Dataset
Description: Data used for predictive maintenance of electrical components, including transformer components.
Features: Load cycles, fault history, operating temperature, humidity, maintenance history.
