IU2341051286
CE-C A1
Experiment 8
AIM: To predict equipment failures using IoT sensor data.
Objective: Apply machine learning for predictive maintenance using Python’s scikit-learn library.
Tools Used: Google Colab, Python libraries (pandas, numpy, scikit-learn, matplotlib, seaborn)
Theory:
Machine learning (ML) is a branch of artificial intelligence (AI) that enables systems to learn
from data and make predictions without being explicitly programmed. A typical predictive
maintenance workflow has the following steps:
1. Data Collection: IoT sensors collect data such as temperature, vibration, pressure, and
humidity.
2. Data Preprocessing: Cleaning and transforming raw data into a suitable format for
training ML models.
3. Feature Engineering: Selecting and transforming relevant sensor readings to improve
model performance.
4. Model Selection & Training: Choosing an appropriate machine learning model and
training it using historical data.
5. Model Evaluation: Assessing the model’s accuracy using evaluation metrics.
6. Failure Prediction: Deploying the trained model to predict equipment failures in real-time.
7. Decision Support: Using predictions to schedule maintenance before failure occurs.
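Steps 6 and 7 can be illustrated with a minimal sketch: a trained classifier returns a failure probability for each new reading, and a cut-off turns that probability into a maintenance flag. The two-feature data, the labelling rule, and the name ALERT_THRESHOLD are assumptions made for illustration and are not part of the experiment.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical historical data: temperature and vibration readings.
X_hist = np.column_stack([
    rng.uniform(30, 100, 500),   # temperature
    rng.uniform(0.1, 3.0, 500),  # vibration
])
# Assumed labelling rule (illustration only): hot, strongly vibrating machines fail.
y_hist = ((X_hist[:, 0] > 80) & (X_hist[:, 1] > 2.0)).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_hist, y_hist)

# New readings arriving from the sensors (made-up values).
X_new = np.array([[95.0, 2.8], [40.0, 0.5]])
failure_probability = model.predict_proba(X_new)[:, 1]

ALERT_THRESHOLD = 0.5  # assumed cut-off; in practice tuned to maintenance costs
needs_maintenance = failure_probability >= ALERT_THRESHOLD

for reading, prob, flag in zip(X_new, failure_probability, needs_maintenance):
    print(f"reading={reading}, P(failure)={prob:.2f}, schedule maintenance={flag}")
```

Lowering the threshold catches more failures at the cost of more unnecessary maintenance visits.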
Several machine learning models can be used to predict equipment failures. These models fall
into three main categories: supervised learning, unsupervised learning, and reinforcement
learning.
Supervised learning involves training a model on labeled data, where each sensor reading is
associated with a known failure status (0 = No Failure, 1 = Failure).
Since failure prediction is a binary classification problem (failure vs. no failure), the following
models are widely used:
● Logistic Regression: A simple statistical model that estimates the probability of failure
based on sensor readings.
● Decision Trees: A tree-like structure where each node represents a decision based on
sensor values, leading to a failure or non-failure classification.
● Random Forest: An ensemble of multiple decision trees that improves prediction
accuracy by combining the predictions of the individual trees.
● Support Vector Machines (SVM): A model that finds an optimal boundary (hyperplane)
between failure and non-failure instances.
● Neural Networks: Deep learning models that learn complex patterns in sensor data,
useful for large datasets.
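As a rough sketch of how the scikit-learn models listed above could be compared, the snippet below scores four of them with 5-fold cross-validation. The synthetic data and the labelling rule are assumptions for illustration; the scores say nothing about real equipment.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n = 400
# Hypothetical sensor readings: temperature, vibration, pressure.
X = np.column_stack([
    rng.uniform(30, 100, n),
    rng.uniform(0.1, 3.0, n),
    rng.uniform(50, 400, n),
])
# Assumed labelling rule, used only to have something to learn.
y = ((X[:, 0] > 80) & (X[:, 1] > 1.5)).astype(int)

models = {
    # Logistic regression and SVM are sensitive to feature scale, so they are scaled first.
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Mean F1-score of the failure class over 5 cross-validation folds.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: mean F1 = {score:.3f}")
```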
Instead of predicting failure as a yes/no classification, regression models estimate the remaining
useful life (RUL) of a machine, for example the number of operating hours left before a failure
is expected.
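A minimal regression sketch for RUL is shown below. It assumes a made-up linear relationship between machine age, vibration, and remaining hours; the variable rul_hours and its formula are hypothetical and only serve to give the regressor a target.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 300
machine_age = rng.uniform(1, 20, n)
vibration = rng.uniform(0.1, 3.0, n)

# Assumed ground truth: RUL shrinks with age and vibration, plus measurement noise.
rul_hours = 100 - 3 * machine_age - 10 * vibration + rng.normal(0, 2, n)

X = np.column_stack([machine_age, vibration])
X_train, X_test, y_train, y_test = train_test_split(
    X, rul_hours, test_size=0.2, random_state=1
)

reg = LinearRegression().fit(X_train, y_train)
mae = mean_absolute_error(y_test, reg.predict(X_test))

print("Learned coefficients (age, vibration):", reg.coef_)
print(f"Mean absolute error on the test set: {mae:.2f} hours")
```

Unlike the classifier, the output here is a continuous number, so it is evaluated with an error measure such as mean absolute error rather than accuracy.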
Unsupervised learning is useful when labeled failure data is not available. It detects anomalies
or patterns that deviate from normal operating conditions.
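One way to sketch this is with scikit-learn's IsolationForest, which is fitted on readings taken during normal operation and then labels new readings as normal (1) or anomalous (-1). The choice of IsolationForest and the synthetic readings are assumptions; the experiment itself does not use an unsupervised model.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(7)

# Hypothetical readings from normal operation: temperature around 60, vibration around 1.0.
normal_readings = np.column_stack([
    rng.normal(60, 5, 500),
    rng.normal(1.0, 0.2, 500),
])

# No failure labels are needed; the detector only sees normal data.
detector = IsolationForest(n_estimators=100, random_state=7)
detector.fit(normal_readings)

# One typical reading and one that lies far outside the normal range.
new_readings = np.array([[61.0, 1.05], [95.0, 2.9]])
labels = detector.predict(new_readings)  # 1 = normal, -1 = anomaly

for reading, label in zip(new_readings, labels):
    status = "anomaly" if label == -1 else "normal"
    print(f"reading={reading} -> {status}")
```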
Feature engineering is the process of selecting and transforming raw sensor data into
meaningful inputs for machine learning models. Important features for predictive maintenance
include:
● Statistical Features: Mean, standard deviation, variance of sensor readings over time.
● Time-Series Features: Rolling averages, moving windows of sensor readings.
● Frequency-Domain Features: Fourier transforms to capture vibration patterns.
● Domain-Specific Features: Sensor thresholds based on manufacturer specifications.
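The feature types listed above can be sketched with pandas and NumPy. The signal below is a synthetic 5 Hz vibration sampled at an assumed 100 Hz, and VIBRATION_LIMIT is a hypothetical threshold standing in for a manufacturer specification.

```python
import numpy as np
import pandas as pd

SAMPLE_RATE_HZ = 100                      # assumed sampling rate
t = np.arange(200) / SAMPLE_RATE_HZ       # 2 seconds of samples
rng = np.random.default_rng(0)

# Synthetic vibration signal: a 5 Hz oscillation plus a little noise.
vibration = np.sin(2 * np.pi * 5 * t) + 0.1 * rng.standard_normal(t.size)
df = pd.DataFrame({"vibration": vibration})

# Statistical and time-series features: rolling mean and standard deviation
# over a 20-sample window (the first 19 rows have no full window yet).
df["roll_mean"] = df["vibration"].rolling(window=20).mean()
df["roll_std"] = df["vibration"].rolling(window=20).std()

# Frequency-domain feature: dominant frequency from the Fourier transform,
# skipping the zero-frequency (DC) bin.
spectrum = np.abs(np.fft.rfft(vibration))
freqs = np.fft.rfftfreq(vibration.size, d=1 / SAMPLE_RATE_HZ)
dominant_freq = freqs[np.argmax(spectrum[1:]) + 1]

# Domain-specific feature: flag samples whose magnitude exceeds a limit.
VIBRATION_LIMIT = 0.9                     # hypothetical specification value
df["over_limit"] = df["vibration"].abs() > VIBRATION_LIMIT

print(df.tail())
print("Dominant frequency (Hz):", dominant_freq)
```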
Hunny Mane
IU2341051286
CE-C A1
Proper feature selection improves model accuracy and interpretability.
Once a model is trained, its effectiveness is evaluated using metrics such as accuracy,
precision, recall, F1-score, and the confusion matrix.
A good predictive maintenance model should have high precision and recall, minimizing false
positives (unnecessary maintenance) and false negatives (missed failures).
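The relationship between these metrics can be checked on a small hand-made example. The labels below are invented for illustration: they contain 3 true positives, 2 false positives, 1 false negative, and 4 true negatives.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Made-up labels: 1 = Failure, 0 = No Failure.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / all = 7 / 10
precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 3 / 5
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 3 / 4
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here precision is lower than recall because the two false alarms outnumber the single missed failure.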
Program code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
np.random.seed(42)
num_samples = 1000
temperature = np.random.randint(30, 100, num_samples)
vibration = np.round(np.random.uniform(0.1, 3.0, num_samples), 2)
pressure = np.random.randint(50, 400, num_samples)
humidity = np.random.randint(20, 80, num_samples)
machine_age = np.random.randint(1, 20, num_samples)
failure = np.where(
    (temperature > 80) & (vibration > 2.0) & (pressure > 300) & (humidity > 60),
    1,
    np.random.choice([0, 1], size=num_samples, p=[0.85, 0.15])
)
df = pd.DataFrame({
"Temperature": temperature,
"Vibration": vibration,
"Pressure": pressure,
"Humidity": humidity,
"Machine_Age": machine_age,
"Failure": failure
})
csv_filename = "iot_sensor_data.csv"
df.to_csv(csv_filename, index=False)
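df = pd.read_csv("iot_sensor_data.csv")
print(df.head())
# Separate the sensor features from the failure label
X = df.drop(columns=['Failure'])
y = df['Failure']
# Hold out 20% of the samples for testing (same split settings as the homework program)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train the Random Forest classifier (n_estimators and random_state are assumed values)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
# Predict on the test set and report the metrics
y_pred = model.predict(X_test)
print("Model Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))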
plt.figure(figsize=(5, 4))
sns.heatmap(confusion_matrix(y_test, y_pred), annot=True, fmt='d', cmap='Blues',
            xticklabels=['No Failure', 'Failure'],
            yticklabels=['No Failure', 'Failure'])
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
feature_importance = pd.Series(model.feature_importances_,
index=X.columns).sort_values(ascending=False)
plt.figure(figsize=(8, 5))
sns.barplot(x=feature_importance, y=feature_importance.index)
plt.xlabel("Feature Importance Score")
plt.ylabel("Features")
plt.title("Feature Importance in Predictive Maintenance Model")
plt.show()
Explanation of LOGIC:
Message Flow:
Flowchart:
Start Program
↓
Load and Preprocess IoT Sensor Data
↓
Train Machine Learning Model
↓
Predict Equipment Failures
↓
Evaluate Model Performance
↓
Deploy for Predictive Maintenance
↓
End
Observation Tables:
No.  Temperature  Vibration  Pressure  Humidity  Machine_Age  Failure
1    81           1.94       161       56        18           0
2    44           2.23       62        61        1            0
3    90           2.56       135       36        6            0
4    50           0.46       283       56        6            0
5    53           2.64       139       73        16           0
Model Accuracy: 0.865
Classification Report
Confusion Matrix
The confusion matrix is a table that compares the model's predictions with the actual labels.
It consists of True Positives (TP), True Negatives (TN), False Positives (FP), and False
Negatives (FN). True Positives and True Negatives are correctly classified instances. False
Positives (Type I errors) are healthy machines flagged as failing, and False Negatives (Type II
errors) are failures the model missed. The matrix shows where the model misclassifies, which
allows targeted improvements.
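As a small sketch, the four cells can be read directly from scikit-learn's confusion_matrix and turned into error rates. The labels are invented for illustration, and the names false_alarm_rate and miss_rate are introduced here only for this example.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels: 1 = Failure, 0 = No Failure.
y_true = [0, 0, 0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 1, 0, 0, 1, 0]

# For binary labels the matrix is [[TN, FP], [FN, TP]]; ravel() flattens it row by row.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

false_alarm_rate = fp / (fp + tn)  # share of healthy machines flagged as failing
miss_rate = fn / (fn + tp)         # share of real failures the model missed

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"false alarm rate={false_alarm_rate:.2f}, miss rate={miss_rate:.2f}")
```

In predictive maintenance the miss rate is usually the more costly of the two, since a missed failure means unplanned downtime.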
Outcome: (attach All screenshots of readable size)
Index  Temperature  Vibration  Pressure  Humidity  Machine_Age  Failure
0      36           1.24       205       74        13           0
1      36           0.44       311       47        2            0
2      76           2.36       222       72        19           1
3      37           1.15       72        41        19           0
4      30           2.23       110       61        19           0
Classification Report:
Conclusion: Predictive maintenance using IoT sensor data and machine learning helps prevent
equipment failures, reducing operational costs and downtime. In this experiment the Random
Forest model reached an accuracy of 0.865, which makes it a useful tool for industrial
applications.
Homework Assigned:
Task: Extend the program to implement a deep learning model (e.g., Neural Network) for failure
prediction.
Program code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from tensorflow import keras
from tensorflow.keras import layers
df = pd.read_csv("iot_sensor_data.csv")
X = df.drop(columns=['Failure'])
y = df['Failure']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
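model = keras.Sequential([
    layers.Dense(32, activation='relu', input_shape=(X_train.shape[1],)),
    layers.Dense(16, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])
# Compile and train the network (optimizer, epochs and batch size are assumed values)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.2, verbose=0)
# Convert the predicted probabilities to class labels with a 0.5 threshold
y_pred = (model.predict(X_test) > 0.5).astype(int).ravel()
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))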
Output:
Classification Report:
Confusion Matrix:
Observation Tables:
Conclusion:
The neural network model achieved an overall accuracy of 87%, indicating a strong
performance in predicting equipment failures based on the IoT sensor data. The precision and
recall values for both classes (failure and no failure) are well-balanced, with F1-scores of 0.87,
suggesting that the model effectively identifies both classes without significant bias. The
confusion matrix reveals that the model has a good ability to minimize false positives and false
negatives, making it a reliable tool for predictive maintenance.