Machine Failure Prediction
October 3, 2024
Project Overview:
I built a model to predict machine failure. The first challenge was locating an appropriate
dataset given by Teachnook. After a few hours of focused searching, I found a dataset on machine
failure detection, which is a classification problem.
Introduction
Predicting machine failure is a crucial part of reliability and maintenance engineering.
By detecting machine failure early, companies can cut expenses, minimize lost production,
and enhance overall productivity. This project goes over the main ideas, methods, and
recommended practices for machine failure prediction.
Why Is It Important to Predict Machine Failure?
• Cost Savings : Unexpected equipment failure can result in expensive repairs, slowed output,
and lost income. By enabling prompt responses, predictive maintenance aids in the prevention
of these problems.
• Safety : Operators and other personnel may be in danger of injury or death if a machine
malfunctions. Proactive steps to reduce these risks are made possible by the ability to predict
failures in advance.
• Optimised Maintenance : Predictive maintenance focuses on the actual condition of each piece
of equipment rather than on fixed schedules, which may be inefficient. This targeted strategy
optimises maintenance effort.
Techniques for Machine Failure Prediction
1. Condition Monitoring :
Collect data from sensors (such as vibration, temperature, and pressure) mounted on equipment.
Analyze the data to identify any anomalies or departures from normal behavior.
Use statistical approaches, machine learning, or deep learning algorithms to forecast failures.
2. Failure Mode and Effects Analysis (FMEA) :
Identify the possible failure modes for each machine component.
Determine the severity, occurrence, and detectability of each failure mode.
Prioritise actions according to risk assessment.
3. Prognostics and Health Management (PHM) :
Use real-time and historical data to estimate remaining useful life (RUL).
Anticipate when a component will fail based on its deterioration trend.
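The PHM idea of projecting a deterioration trend forward can be illustrated with a minimal sketch. It assumes a single health indicator that degrades roughly linearly and a known failure threshold. The readings, the threshold value, and the function name `estimate_rul` are hypothetical and are not part of this project's dataset.

```python
import numpy as np

def estimate_rul(times, health, failure_threshold):
    """Fit a straight line to a degradation signal and extrapolate to the threshold.

    Returns the estimated time remaining after the last observation,
    or None if the fitted trend is not heading toward the threshold.
    """
    slope, intercept = np.polyfit(times, health, deg=1)
    current = slope * times[-1] + intercept   # fitted value at the last reading
    gap = failure_threshold - current
    # No usable estimate if the trend is flat or moving away from the threshold
    if slope == 0 or gap / slope <= 0:
        return None
    return gap / slope

# Hypothetical vibration amplitude (mm/s) sampled once per hour
hours = np.arange(10, dtype=float)
vibration = 2.0 + 0.5 * hours          # rises by 0.5 mm/s every hour
rul_hours = estimate_rul(hours, vibration, failure_threshold=10.0)
print(f"Estimated RUL: {rul_hours:.1f} hours")
```

Real degradation is rarely linear, so a straight-line fit is only a starting point; it mainly shows how a trend and a threshold combine into a time-to-failure estimate.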
Data requirements
Historical data :
Gather historical data on machine performance, maintenance actions, and failures. Include time
stamps, sensor readings, and maintenance logs.
Feature Engineering :
Extract relevant features (e.g. rolling averages, statistical moments) from raw sensor data. Use
domain-specific knowledge to create meaningful features.
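As an illustration of rolling averages and statistical moments, the sketch below derives windowed features from a raw sensor series with pandas. The temperature values are randomly generated, and the window length of 5 and the feature names are arbitrary assumptions rather than choices made in this project.

```python
import numpy as np
import pandas as pd

# Hypothetical raw temperature readings; not taken from the project's dataset
rng = np.random.default_rng(0)
raw = pd.DataFrame({"temperature": 60 + rng.normal(0, 2, size=50)})

window = 5  # number of consecutive readings per window (an arbitrary choice)
rolling = raw["temperature"].rolling(window)

features = pd.DataFrame({
    "temp_mean": rolling.mean(),   # rolling average
    "temp_std": rolling.std(),     # spread within the window
    "temp_skew": rolling.skew(),   # asymmetry within the window
    "temp_kurt": rolling.kurt(),   # tail weight within the window
}).dropna()                        # the first window-1 rows have no full window

print(features.head())
```

Longer windows smooth out noise but react more slowly to a developing fault, so the window length is usually chosen with the machine's behaviour in mind.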
Go deeper into the code
Here, I share the code and methods I used to predict machine failure in my project. This includes
data processing, feature engineering, model training, and evaluation. The components and code can
be found in the Jupyter Notebook linked below.
[3]: import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import numpy as np
from google.colab import drive
from sklearn.preprocessing import StandardScaler
# data splitting
from sklearn.model_selection import train_test_split
# data modeling
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score, roc_curve, classification_report
Load Dataset :
[7]: #load dataset
df=pd.read_csv('/content/machine failure.csv')
df.head()
[8]: df.isnull().sum()   # assumed input: only the output of this cell survives in the export
[8]: tempMode       0
     AQ             0
     USS            0
     CS             0
     VOC            0
     RP             0
     IP             0
     Temperature    0
     fail           0
     dtype: int64
[9]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 944 entries, 0 to 943
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   tempMode     944 non-null    int64
 1   AQ           944 non-null    int64
 2   USS          944 non-null    int64
 3   CS           944 non-null    int64
 4   VOC          944 non-null    int64
 5   RP           944 non-null    int64
 6   IP           944 non-null    int64
 7   Temperature  944 non-null    int64
 8   fail         944 non-null    int64
dtypes: int64(9)
memory usage: 66.5 KB
Data Visualization
[10]: sns.countplot(x='CS',data=df)
plt.show()
[11]: cor_matrix=df.corr()
sns.heatmap(cor_matrix,annot=True,cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()
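A heatmap of nine columns can be hard to read at a glance, so the correlations with the target can also be ranked directly. The sketch below uses a small synthetic frame that borrows a few column names from the dataset. The values are made up, and the dependence of `fail` on `VOC` is built in purely for illustration. With the real data, `df` would take the place of `demo`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the dataset: same idea, made-up values
rng = np.random.default_rng(42)
n = 200
voc = rng.integers(0, 7, size=n)
demo = pd.DataFrame({
    "VOC": voc,
    "AQ": rng.integers(1, 8, size=n),
    "Temperature": rng.integers(1, 25, size=n),
    # Failure is made to depend on VOC here purely for illustration
    "fail": (voc + rng.normal(0, 1, size=n) > 3).astype(int),
})

# Correlation of every feature with the target, strongest first
target_corr = (
    demo.corr()["fail"]
    .drop("fail")
    .sort_values(key=abs, ascending=False)
)
print(target_corr)
```

Pearson correlation only captures linear relationships, so a feature with a low score here can still be useful to a tree-based model.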
[12]: sns.pairplot(df.drop(['AQ','USS','VOC','IP',],axis=1),hue='CS')
plt.show()
[13]: plt.figure(figsize=(20,15))
m=1
# one box plot per variable, grouped by USS and coloured by CS
for i in ['tempMode','RP','Temperature','fail']:
    plt.subplot(3,2,m)
    sns.boxplot(x='USS',y=i,data=df,hue="CS")
    m+=1
plt.show()
[30]: plt.figure(figsize=(20,15))
m=1
# one bar plot per variable, grouped by USS and coloured by CS
for i in ['tempMode','RP','Temperature','fail']:
    plt.subplot(3,2,m).set_title(label=("Possibility of failure wrt "+i))
    sns.barplot(x='USS',y=i,data=df,hue="CS")
    m+=1
plt.show()
[32]: from sklearn.preprocessing import LabelEncoder
label_encoder=LabelEncoder()
df['tempMode']=label_encoder.fit_transform(df['tempMode'])
Model Training :
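The cell that creates `x_train`, `x_test`, `y_train`, and `y_test` does not appear in this export. A minimal sketch of that step follows, assuming `fail` is the target and a 70/30 split. That ratio is consistent with the 284 test rows in the logistic regression confusion matrix (30% of 944, rounded up), but the `random_state`, the use of `stratify`, and the scaling with `StandardScaler` are assumptions. A synthetic frame named `df_demo` stands in for `df` so the block runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with the same shape as the dataset (944 rows, 8 features + target)
rng = np.random.default_rng(0)
cols = ["tempMode", "AQ", "USS", "CS", "VOC", "RP", "IP", "Temperature"]
df_demo = pd.DataFrame(rng.integers(0, 8, size=(944, len(cols))), columns=cols)
df_demo["fail"] = rng.integers(0, 2, size=944)

x = df_demo.drop("fail", axis=1)   # features
y = df_demo["fail"]                # target

# Assumed settings: 30% held out, fixed seed, class balance preserved
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=42, stratify=y
)

# Fit the scaler on the training rows only, then apply it to both parts
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

print(x_train.shape, x_test.shape)
```

Fitting the scaler on the training rows alone keeps information from the test set out of the model, which matters most for scale-sensitive models such as logistic regression.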
[34]: # baseline model: logistic regression
lr=LogisticRegression()
lr.fit(x_train,y_train)
y_pred=lr.predict(x_test)
# evaluate on the test set
print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))
print(accuracy_score(y_test,y_pred))
[[141 19]
[ 15 109]]
precision recall f1-score support
0.8802816901408451
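The accuracy printed above follows directly from the confusion matrix. As a worked check, assuming scikit-learn's layout (rows are true classes, columns are predicted classes, with class 1 meaning failure), the headline metrics for the failure class can be recomputed by hand:

```python
# Counts copied from the confusion matrix printed by the logistic regression cell
tn, fp = 141, 19   # true class 0: predicted 0, predicted 1
fn, tp = 15, 109   # true class 1: predicted 0, predicted 1

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)   # share of predicted failures that were real
recall = tp / (tp + fn)      # share of real failures that were caught
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

The accuracy works out to 250/284, which matches the printed score. For maintenance use, the 15 false negatives (failures the model missed) are usually the costlier error, so recall on the failure class deserves as much attention as accuracy.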
[37]: from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier,GradientBoostingClassifier
from sklearn.svm import SVC
# Decision Tree
dt=DecisionTreeClassifier()
dt.fit(x_train,y_train)
y_pred=dt.predict(x_test)
accuracy_dt = accuracy_score(y_test,y_pred)
print("Decision Tree Accuracy:",accuracy_dt)
#Random Forest
rf=RandomForestClassifier()
rf.fit(x_train,y_train)
y_pred_rf=rf.predict(x_test)
accuracy_rf = accuracy_score(y_test,y_pred_rf)
print("Random Forest Accuracy:",accuracy_rf)
#Gradient Boosting
gb=GradientBoostingClassifier()
gb.fit(x_train,y_train)
y_pred_gb=gb.predict(x_test)
accuracy_gb = accuracy_score(y_test,y_pred_gb)
print("Gradient Boosting Accuracy:",accuracy_gb)
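# Learning curve. The code that computed train_sizes and train_scores is not
# part of this export; the call below is an assumed reconstruction that uses
# the random forest `rf` and 5-fold cross-validation.
from sklearn.model_selection import learning_curve
train_sizes, train_scores, test_scores = learning_curve(
    rf, x_train, y_train, cv=5, scoring='accuracy')

plt.figure(figsize=(10, 6))
plt.plot(train_sizes, np.mean(train_scores, axis=1), 'o-', label='Training Accuracy')
plt.xlabel('Training Examples')
plt.ylabel('Accuracy')
plt.title('Learning Curve')
plt.legend(loc='best')
plt.grid(True)
plt.show()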
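A single train/test split gives one accuracy figure per model, and that figure can shift with the split. A hedged sketch of a cross-validated comparison of the same model families follows. The synthetic data from `make_classification`, the 5 folds, and the fixed seeds are all assumptions, and the scores it prints say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the machine failure dataset
X, y = make_classification(n_samples=300, n_features=8, n_informative=4, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Mean and standard deviation of accuracy across the folds, per model
cv_results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    cv_results[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the spread alongside the mean makes it easier to tell whether a gap between two models is larger than the fold-to-fold variation.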