ICT Assignment
CMS: 509801
BEE-16 D
DATE: 21-12-2024
Page 1 of 15
QUESTION 1
(1)
Supervised learning can classify faults in a power grid by training a model on labeled data, i.e.,
sensor readings recorded under normal and fault conditions. Once trained, the model can classify
new sensor readings as normal condition, short circuit, or line breakage.
(2)
Input:
1. Voltage
2. Current
3. Frequency
Output:
1. Line Breakage
2. Short Circuit
(3)
1. Simulation of Data:
Use physics-based grid simulations or create synthetic sensor readings. For each feature
(voltage, current, frequency), generate values under normal and fault conditions.
2. Labeling of Data:
Assign labels based on predefined thresholds (e.g., voltage < X, current > Y for a short circuit).
Challenge:
3. Noisy Sensor Data:
Add realistic noise to mimic sensor inaccuracies, using random perturbations or Gaussian noise.
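
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nominal values, the thresholds standing in for X and Y, and the function names are all hypothetical and are not taken from a real grid.

```python
import random

# Hypothetical nominal values and thresholds (assumptions for illustration only).
NOMINAL_V, NOMINAL_I, NOMINAL_F = 230.0, 10.0, 50.0
V_SHORT_MAX = 150.0   # stands in for "voltage < X" in a short circuit
I_SHORT_MIN = 30.0    # stands in for "current > Y" in a short circuit
I_BREAK_MAX = 1.0     # near-zero current is taken to mean line breakage


def simulate_reading(condition, rng, noise_frac=0.02):
    """Return a noisy (voltage, current, frequency) reading for one condition."""
    if condition == "short_circuit":
        v, i, f = 0.4 * NOMINAL_V, 5.0 * NOMINAL_I, NOMINAL_F
    elif condition == "line_breakage":
        v, i, f = NOMINAL_V, 0.0, NOMINAL_F
    else:  # normal operation
        v, i, f = NOMINAL_V, NOMINAL_I, NOMINAL_F
    # Gaussian noise mimics sensor inaccuracy; sigma scales with the nominal value.
    v += rng.gauss(0.0, noise_frac * NOMINAL_V)
    i += rng.gauss(0.0, noise_frac * NOMINAL_I)
    f += rng.gauss(0.0, noise_frac * NOMINAL_F)
    return v, max(i, 0.0), f


def label_reading(v, i, f):
    """Assign a label from the predefined thresholds."""
    if v < V_SHORT_MAX and i > I_SHORT_MIN:
        return "short_circuit"
    if i < I_BREAK_MAX:
        return "line_breakage"
    return "normal"


rng = random.Random(0)
dataset = []
for condition in ("normal", "short_circuit", "line_breakage"):
    for _ in range(100):
        reading = simulate_reading(condition, rng)
        dataset.append((reading, label_reading(*reading)))
```

Each entry of `dataset` pairs a noisy reading with its threshold-based label, which is the form a supervised model would be trained on.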
(4)
1. Collection of Data:
Gather real-time sensor data and label events based on operator logs or simulations.
2. Data Preprocessing:
Filter noise, normalize readings, and extract features like rate of change.
3. Dividing the Data into Training and Testing Set:
Assign 70%-80% of the dataset for training and 20%-30% for testing.
4. Model Training:
Select and train a Decision Tree or Neural Network model.
5. Prediction using the Trained Model:
Detect faults in real-time based on new sensor readings.
6. Model Evaluation:
Compute precision and F1-score for each fault type.
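
Steps 2 and 3 of this pipeline (preprocessing and splitting) could look roughly like the sketch below. The helper names and the sample current readings are hypothetical, and a trailing moving average is just one simple choice of noise filter.

```python
import random


def moving_average(values, window=3):
    """Smooth a series with a trailing moving average to damp sensor noise."""
    smoothed = []
    for k in range(len(values)):
        chunk = values[max(0, k - window + 1): k + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def rate_of_change(values):
    """First difference between consecutive samples (first entry set to 0)."""
    return [0.0] + [b - a for a, b in zip(values, values[1:])]


def min_max_normalize(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split off the test fraction."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical current readings (amperes) with a sudden jump at the end.
current = [10.0, 10.2, 9.9, 10.1, 48.0, 50.0]
features = list(zip(min_max_normalize(moving_average(current)),
                    rate_of_change(current)))
train, test = train_test_split(features, test_fraction=0.2)
```

The rate-of-change feature makes the jump from about 10 A to 48 A stand out, which is the kind of signal a fault classifier can pick up.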
DATA SET:
QUESTION 2
(1)
The possible features that could be extracted from signals may be:
1. Time-Domain Features:
Amplitude Variations: Signal strength over time (important for AM).
Instantaneous Frequency: Rate of phase change (key for FM).
Energy: Signal power in time windows.
2. Frequency-Domain Features:
Spectral Density: Energy distribution across frequencies.
Bandwidth: Width of the signal’s frequency spectrum.
Harmonics: Key for AM and FM signals.
3. Statistical Features:
Mean & Variance: Amplitude and frequency properties.
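
As a rough illustration, a few of these features can be computed from one signal segment as follows. The function name and the 50 Hz test tone are hypothetical, and the naive DFT is written out only to keep the sketch dependency-free; a real pipeline would normally use an FFT routine.

```python
import cmath
import math


def signal_features(samples, sample_rate):
    """Compute a few time- and frequency-domain features of one segment."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n
    energy = sum(s * s for s in samples)
    # Naive O(n^2) DFT over the non-negative frequency bins.
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        magnitudes.append(abs(acc))
    # Skip bin 0 (the DC component) when looking for the spectral peak.
    peak_bin = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return {
        "mean": mean,
        "variance": variance,
        "energy": energy,
        "dominant_frequency_hz": peak_bin * sample_rate / n,
    }


# Hypothetical test tone: 50 Hz sine sampled at 1 kHz for 0.1 s.
sample_rate = 1000
tone = [math.sin(2 * math.pi * 50 * t / sample_rate) for t in range(100)]
features = signal_features(tone, sample_rate)
```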
(2)
1. Data Collection
Use a signal generator to simulate signals in AM, FM, and PSK schemes under various conditions
(e.g., noise, fading). Collect real-world signals from communication channels if possible.
2. Data Preprocessing
Filtering: Remove noise using low-pass or band-pass filters.
Segmentation: Divide the signal into fixed-length segments.
Normalization: Standardize feature values for consistency.
3. Feature Extraction
4. Data Splitting
Split the dataset into a Training Set (70%) to train the model, a Validation Set (15%) for
hyperparameter tuning, and a Test Set (15%) to evaluate model performance.
5. Model Training
Use a supervised learning model such as a Decision Tree or SVM to train the classifier.
6. Model Validation
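
The segmentation and data-splitting steps above might be sketched as follows. The segment length of 20 samples and the placeholder recording are assumptions chosen only to make the example concrete.

```python
import random


def segment(samples, length):
    """Cut a signal into non-overlapping fixed-length segments (tail dropped)."""
    return [samples[i:i + length]
            for i in range(0, len(samples) - length + 1, length)]


def split_70_15_15(items, seed=0):
    """Shuffle and split into training, validation, and test sets."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 70 // 100
    n_val = len(shuffled) * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical recording of 2010 samples; the last 10 do not fill a segment.
recording = list(range(2010))
segments = segment(recording, 20)
train, val, test = split_70_15_15(segments)
```

Splitting after segmentation keeps every segment in exactly one of the three sets, so the test set stays unseen during training and tuning.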
(3)
Accuracy is a suitable evaluation metric for a balanced dataset, i.e., one where all modulation
schemes are equally represented. The F1-score, which is the harmonic mean of precision and
recall, is better suited to an imbalanced dataset. Here, the F1-score is preferred because
modulation schemes may have different prevalence (e.g., FM could sometimes dominate), and
computing the F1-score per class ensures performance is evaluated fairly across all classes.
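
A small worked example shows why accuracy can mislead here. The label counts below are hypothetical: FM makes up 90% of the test set and the model always predicts FM.

```python
def per_class_f1(y_true, y_pred):
    """Return {class: F1} computed from true and predicted labels."""
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[cls] = 2 * precision * recall / denom if denom else 0.0
    return scores


# Hypothetical imbalanced test set: FM dominates and the model always says "FM".
y_true = ["FM"] * 90 + ["AM"] * 5 + ["PSK"] * 5
y_pred = ["FM"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = per_class_f1(y_true, y_pred)
macro_f1 = sum(f1.values()) / len(f1)
```

Accuracy comes out at 0.9 even though AM and PSK are never recognized, while the macro-averaged F1-score is about 0.32 because the two missed classes each score 0.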
DATA SET:
QUESTION 3
(1)
Regression techniques can help predict power demand for the next 24 hours by:
1. Time Series Forecasting: Regression models such as Linear Regression and Polynomial
Regression capture trends and patterns from historical data to predict future demand.
2. Trend Prediction: Regression techniques can use time-related features to predict the next
24 hours of demand.
3. Multiple Linear Regression: Incorporating factors like temperature or economic activity can
improve accuracy by considering multiple predictors.
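
As a minimal sketch of the idea, the example below fits a single-predictor linear regression (demand against temperature) with the closed-form least-squares solution. The data points and forecast temperatures are hypothetical; a multiple regression would add further predictors in the same spirit.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical history: temperature (deg C) vs. demand (MW).
temperature = [20.0, 25.0, 30.0, 35.0]
demand = [500.0, 560.0, 640.0, 700.0]
slope, intercept = fit_line(temperature, demand)

# Forecast demand for a few of tomorrow's (hypothetical) hourly temperatures.
forecast_temps = [22.0, 28.0, 33.0]
forecast = [slope * t + intercept for t in forecast_temps]
```

For this toy data the fitted line is roughly demand = 13.6 * temperature + 226, so each extra degree adds about 13.6 MW to the forecast.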
(2)
Three features that might influence power consumption are:
1. Temperature (Weather): Higher temperatures can increase demand due to the use of air
conditioning in hot weather, while colder temperatures can increase heating demand.
Weather patterns can significantly impact power consumption.
2. Day of the Week/Time of Day: Power consumption typically varies by the time of day
(e.g., higher during working hours, lower at night) and by day of the week (e.g., weekdays
vs. weekends).
3. Holiday or Special Events: During holidays or events (e.g., festivals, public holidays), power
consumption patterns can shift due to increased or decreased activity. Holidays can result
in changes like more people staying home or businesses closing, affecting overall demand.
(3)
Temporal trends such as seasonal patterns and holidays can be included in the model as
features. Electricity demand often follows seasonal trends, with higher consumption in the summer
(due to cooling needs) or winter (due to heating). These trends can be included by adding a
seasonality feature or by using sinusoidal encoding to represent cyclical patterns. Similarly,
public holidays can lead to deviations from the typical consumption pattern. To capture this, a
binary holiday indicator (1 for holidays, 0 otherwise) or a public holiday schedule can be added
as features.
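
The sinusoidal encoding and the holiday indicator could be built as in the sketch below. The holiday set and the feature names are hypothetical, and dividing the day of the year by 365 is a small approximation in leap years.

```python
import math
from datetime import datetime

# Hypothetical holiday calendar as (month, day) pairs; a real one would come
# from the public holiday schedule.
HOLIDAYS = {(1, 1), (12, 25)}


def temporal_features(ts):
    """Encode hour-of-day and day-of-year as sine/cosine pairs plus a holiday flag."""
    hour_angle = 2 * math.pi * ts.hour / 24
    day_angle = 2 * math.pi * (ts.timetuple().tm_yday - 1) / 365
    return {
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "season_sin": math.sin(day_angle),
        "season_cos": math.cos(day_angle),
        "is_holiday": 1 if (ts.month, ts.day) in HOLIDAYS else 0,
    }


features = temporal_features(datetime(2024, 12, 25, 18, 0))
```

Using a sine/cosine pair means 23:00 and 00:00 end up close together in feature space, which a plain hour number (23 vs. 0) would not achieve.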
DATA SET:
QUESTION 4
(1)
Clustering is one type of unsupervised learning. Clustering methods such as K-Means analyze
user data without predefined categories and automatically group users based on similarities in
their consumption patterns. Clustering can also reveal distinct patterns among industrial and
residential users:
Industrial Users: Characterized by high and consistent electricity usage, often with
predictable peak times.
Residential Users: Typically show more variability, with patterns influenced by time of
day, season, or household size.
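
A bare-bones K-Means (Lloyd's algorithm) is sketched below to show how such groups can emerge without labels. The six users and their two features are hypothetical, and the features are left unscaled for brevity; in practice they would usually be standardized first.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain K-Means (Lloyd's algorithm) on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return labels, centroids


# Hypothetical users as (mean daily kWh, standard deviation of daily kWh).
users = [(950.0, 20.0), (1010.0, 25.0), (980.0, 15.0),   # industrial-like
         (12.0, 6.0), (9.0, 5.0), (15.0, 8.0)]           # residential-like
labels, centroids = kmeans(users, k=2)
```

On this toy data the first three users land in one cluster and the last three in the other, even though the algorithm was never told which users are industrial.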
(2)
(3)
One method for assessing the quality of the clusters is the Silhouette Score. It measures
how similar each user is to the other users in its own cluster compared with users in other
clusters. The score ranges from -1 to 1, and a higher score indicates well-separated clusters.
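
The score can be computed directly from its definition, as in this sketch. The user data and both labelings are hypothetical and serve only to compare a sensible grouping with a poor one.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (needs at least 2 clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # by convention a singleton cluster contributes 0
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical users as (mean daily kWh, standard deviation of daily kWh).
users = [(950.0, 20.0), (1010.0, 25.0), (980.0, 15.0),
         (12.0, 6.0), (9.0, 5.0), (15.0, 8.0)]
good = silhouette_score(users, [0, 0, 0, 1, 1, 1])  # heavy vs. light users
bad = silhouette_score(users, [0, 1, 0, 1, 0, 1])   # arbitrary mixing
```

The sensible grouping scores close to 1, while the mixed labeling scores much lower, which is the behavior the metric is meant to expose.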
(4)
These patterns could help optimize energy distribution and tariff plans by grouping users
into heavy, moderate, and light users:
For Heavy Users: Offer discounts for high volumes or incentives for peak-time reductions.
For Moderate Users: Introduce time-of-use tariffs to encourage off-peak usage.
For Light Users: Provide flat-rate or low-cost tariffs to retain users with low consumption.
DATA SET:
QUESTION 5
(1)
Both supervised and unsupervised learning can be used in this type of problem.
Supervised Learning:
Supervised models predict energy availability from renewable sources based on historical data
and weather conditions. For example, Regression Models (e.g., Linear Regression, Neural
Networks) forecast solar and wind power output, while Classification Models (e.g., Decision
Trees) predict grid statuses such as "normal" or "overloaded."
Unsupervised Learning:
Used for clustering consumption patterns or detecting anomalies in the grid. For example,
Clustering Algorithms (e.g., K-Means) segment regions or customers based on energy usage
profiles.
(2)
Relevance: Solar irradiance and wind speed are critical for renewable energy prediction.
Relevance: Helps capture trends in solar and wind power output for better forecasting.
Relevance: Seasonal, daily, and hourly patterns influence both energy demand and renewable
supply.
(3)
Consider a scenario in which the predictions or discovered patterns help in real time.
Scenario:
A solar farm experiences reduced output due to sudden cloud cover, while demand increases
during evening peak hours.
Action:
The supervised learning model predicts the shortfall in solar output. The system automatically
switches to wind power or discharges energy from batteries. Anomaly detection (unsupervised)
monitors for unusual grid conditions during the transition.
Output:
Prevents outages by maintaining a stable energy supply. Optimizes cost by using renewable
sources and reducing reliance on costly backup generators.
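
The switching logic in this scenario can be pictured as a simple priority rule, sketched below. The function, the source order, and all megawatt figures are hypothetical; a real grid controller would involve far more constraints.

```python
def dispatch(demand_mw, predicted_solar_mw, wind_available_mw,
             battery_available_mw):
    """Cover demand from solar first, then wind, then battery discharge."""
    plan = {"solar": min(predicted_solar_mw, demand_mw)}
    remaining = demand_mw - plan["solar"]
    plan["wind"] = min(wind_available_mw, remaining)
    remaining -= plan["wind"]
    plan["battery"] = min(battery_available_mw, remaining)
    remaining -= plan["battery"]
    plan["unserved"] = remaining  # would have to come from backup generation
    return plan


# Hypothetical evening peak: cloud cover cuts the solar forecast to 20 MW.
plan = dispatch(demand_mw=100.0, predicted_solar_mw=20.0,
                wind_available_mw=50.0, battery_available_mw=40.0)
```

Here the 80 MW solar shortfall is covered by 50 MW of wind and 30 MW from batteries, leaving nothing for the backup generators.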
(4)
Expected Challenges:
1. Solar and wind outputs are highly weather-dependent, making accurate predictions
challenging.
2. Models rely on extensive historical and real-time data, which may be incomplete or
noisy.
3. Grid operations require immediate decisions, demanding fast and scalable machine
learning algorithms.
4. Integrating machine learning systems and IoT devices (e.g., smart meters) incurs
significant upfront investment.
DATA SET:
Dataset Name | Description | Features | Source
Open Power System Data | Dataset for European grid operations, including renewable energy generation and demand. | Weather data, energy production, load data, and storage levels. |
Renewable Energy Forecasting | Renewable energy datasets for wind and solar generation forecasting. | Solar irradiance, wind speed, temperature, humidity, and energy production. |
UCI Smart Grid Stability Dataset | Dataset for predicting grid stability using renewable energy and consumption data. | Historical generation, load, weather data, and grid stability status. |
QUESTION 6
(1)
Regression models predict the remaining lifespan of transformers by identifying relationships
between historical data and failure times. The approach is to use supervised learning, with
historical transformer data as input features and recorded lifespans as output labels.
Algorithms such as Linear Regression and Polynomial Regression, or more advanced models such as
Random Forest Regressors or Neural Networks, can model these relationships effectively.
(2)
The five factors (features) that could influence component lifespan, and their importance,
may be:
Importance: Frequent load changes cause mechanical stress, leading to wear and tear over time.
Importance: Past faults (e.g., short circuits) weaken components and shorten their operational
life.
Importance: High moisture levels degrade insulation and increase the risk of electrical failure.
Importance: Degraded oil loses its cooling and insulation properties, directly impacting
transformer durability.
(3)
The possible consequences may be:
1. Underestimating Lifespan:
2. Overestimating Lifespan:
Delayed maintenance could result in unexpected failures, causing substation outages and
power supply interruptions, damage to other grid components due to cascading failures, and
increased downtime and repair costs.
3. Impact on Reliability:
(4)
1. Install Sensors:
Equip transformers with IoT sensors to continuously monitor key parameters such as
temperature, load, and vibration.
2. Fault Recording:
Maintain detailed logs of all faults, including type, severity, and resolution times.
3. Regular Inspections:
Periodically assess transformers for physical wear, oil quality, and insulation degradation.
4. Historical Data:
Gather and digitize maintenance and operational records to establish trends over time.
5. Environmental Monitoring:
Collect data on ambient conditions (e.g., temperature, humidity) that could impact
performance.
6. Data Validation:
Use automated checks and manual reviews to ensure the data is accurate and free of
inconsistencies or missing values.
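
The automated checks in step 6 might be as simple as the sketch below, which flags missing fields and out-of-range values. The field names and the plausible ranges are hypothetical and would depend on the actual sensors and transformer ratings.

```python
# Hypothetical plausible ranges per field; real limits depend on the transformer.
VALID_RANGES = {
    "temperature_c": (-40.0, 150.0),
    "load_percent": (0.0, 150.0),
    "humidity_percent": (0.0, 100.0),
}


def validate_record(record):
    """Return a list of problems found in one sensor record (empty if clean)."""
    problems = []
    for field, (low, high) in VALID_RANGES.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not low <= value <= high:
            problems.append(f"{field}: {value} outside [{low}, {high}]")
    return problems


records = [
    {"temperature_c": 65.0, "load_percent": 80.0, "humidity_percent": 40.0},
    {"temperature_c": 900.0, "load_percent": 80.0},  # bad value, missing field
]
report = {i: validate_record(r) for i, r in enumerate(records)}
flagged = {i: problems for i, problems in report.items() if problems}
```

Records that end up in `flagged` can then be passed on for the manual review mentioned above.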
DATA SET:
Dataset Name | Description | Features | Source
IEEE Transformer Dataset | A dataset of transformer fault and operational data, including temperature, load, and fault history. | Operating temperature, load cycles, fault history, vibration, environmental data (humidity, dust, etc.). |
UCI Predictive Maintenance Dataset | Data used for predictive maintenance of electrical components, including transformer components. | Fault history, operational stress, temperature data, vibration levels, maintenance logs. |
Power Transformer Life Expectancy Dataset | Data used for predictive maintenance of electrical components, including transformer components. | Load cycles, fault history, operating temperature, humidity, maintenance history. |