VISVESVARAYA TECHNOLOGICAL UNIVERSITY

BELAGAVI

Project Report on

“PREDICTIVE MAINTENANCE FOR AUTOMOBILES”

Submitted in partial fulfillment of the requirements for the degree of

BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE AND ENGINEERING
Submitted By

AISHWARYA 1BY20CS015

ANANYA BARATH 1BY20CS022

KUSHI L 1BY20CS087

LAVANYA N 1BY20CS094

Under the guidance of

Dr SATISH KUMAR T
ASSOCIATE PROFESSOR
Department of CSE,
BMSIT&M

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

BMS INSTITUTE OF TECHNOLOGY & MANAGEMENT

YELAHANKA, BENGALURU - 560064.

2023-2024
VISVESVARAYA TECHNOLOGICAL UNIVERSITY
BELAGAVI
BMS INSTITUTE OF TECHNOLOGY AND MANAGEMENT
YELAHANKA, BENGALURU – 560064

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

CERTIFICATE

This is to certify that the project work entitled “Predictive Maintenance for
Automobiles” is a bona fide work carried out by Aishwarya (1BY20CS015), Ananya
Barath (1BY20CS022), Kushi L (1BY20CS087), and Lavanya N (1BY20CS094) in partial
fulfillment of the requirements for the award of the Bachelor of Engineering degree in
Computer Science and Engineering of Visvesvaraya Technological University, Belagavi,
during the year 2023-2024. It is certified that all corrections/suggestions indicated for
Internal Assessment have been incorporated in this report. The project report has been
approved as it satisfies the academic requirements in respect of project work for the
B.E. degree.

Signature of the Guide
Dr. Satish Kumar T
Associate Professor
Dept. of CSE, BMSIT&M

Signature of the HOD
Dr. Thippeswamy G
Professor & HOD
Dept. of CSE, BMSIT&M

Signature of Principal
Dr. Sanjay H. A
Principal, BMSIT&M

External VIVA-VOCE

Name of the Examiners Signature with Date

2.
ACKNOWLEDGEMENT

We are pleased to present this project report upon its successful completion. This
project would not have been possible without the guidance, assistance, and suggestions of
many individuals. We would like to express our deep sense of gratitude to each and every
person who has contributed to making this project a success.

First and foremost, we extend our heartfelt thanks to Dr. Sanjay H. A, Principal,
BMS Institute of Technology & Management, for his constant encouragement and
inspiration in undertaking this project.

We also express our sincere gratitude to Dr. Thippeswamy G, Professor and Head
of the Department, Computer Science and Engineering, BMS Institute of Technology &
Management, for his unwavering support and motivation throughout this endeavor.

We are grateful to our project guide, Dr. Satish Kumar T, Associate Professor,
Department of Computer Science and Engineering, for their invaluable encouragement
and guidance throughout the project.

We extend our heartfelt thanks to our project coordinators, Dr. Vidya R Pai and
Dr. Arunakumari B. N, Assistant Professor(s), Department of Computer Science and
Engineering, for their constant support and valuable advice during the project.

Special thanks are due to all the staff members of the Computer Science and
Engineering Department for their help and cooperation.

Finally, we would like to express our gratitude to our parents and friends for their
unwavering encouragement and support throughout the duration of this project.

By,
Aishwarya 1BY20CS015
Ananya Barath 1BY20CS022
Kushi L 1BY20CS087
Lavanya N 1BY20CS094
DECLARATION

We, Aishwarya (1BY20CS015), Ananya Barath (1BY20CS022), Kushi L (1BY20CS087), and
Lavanya N (1BY20CS094), students of the eighth semester B.E. in the Department of Computer
Science and Engineering, BMS Institute of Technology and Management, Bengaluru, declare
that the project work entitled “Predictive Maintenance for Automobiles” has been carried out
by us and submitted in partial fulfilment of the course requirements for the award of the
degree of Bachelor of Engineering in Computer Science and Engineering of Visvesvaraya
Technological University, Belagavi, during the academic year 2023-2024. The matter
embodied in this report has not been submitted to any other university or institution for the
award of any other degree or diploma.

Aishwarya 1BY20CS015

Ananya Barath 1BY20CS022

Kushi L 1BY20CS087

Lavanya N 1BY20CS094
ABSTRACT

Predictive maintenance (PdM) in the automobile sector helps firms determine when a
vehicle or vehicle part requires servicing by applying techniques such as data mining, data
preprocessing, and machine learning. It is a maintenance strategy that uses digital data
analysis and machine learning algorithms to predict future breakdowns of vehicle components
and systems. By combining data from numerous sensors and systems within cars, machine
learning models can assess historical data, monitor real-time information, and predict
probable component failures or maintenance needs. This predictive approach has several
benefits, including reduced downtime, lower repair costs, and increased safety for both
drivers and passengers.

Engine condition prediction uses supervised learning approaches to classify an engine's
condition as good or bad based on sensor inputs such as engine RPM, oil pressure, and coolant
temperature. The proposed approach aims to predict probable engine faults early, minimizing
downtime and maintenance expenses. Similarly, regression techniques are used to estimate
battery remaining useful life (RUL) from usage and performance parameters. Forecasting
battery RUL facilitates timely maintenance, optimizes resource consumption, and extends
battery life. The emphasis on data preprocessing, feature engineering, and model validation
helps ensure that the predictive models are reliable and accurate. This project contributes to
proactive maintenance methods by combining modern machine learning algorithms with domain
expertise, thereby improving operational efficiency in industrial settings.
TABLE OF CONTENTS

Chapter 1: Introduction
1.1 Background
1.2 Literature Survey
1.3 Motivation
1.4 Problem Statement
1.5 Aim and Objective
1.6 Scope
1.7 Challenges

Chapter 2: Overview

Chapter 3: Requirement Specification
3.1 Mapping of Requirements
3.2 Functional Requirements
3.3 Non-Functional Requirements
3.4 User Requirements
3.5 Domain Requirements
3.6 System Requirements

Chapter 4: Detailed Design
4.1 System Architecture
4.2 System Design
4.3 Activity Diagram
4.4 Use Case Diagram
4.5 Sequence Diagram
4.6 Data Flow Diagram

Chapter 5: Implementation
5.1 Programming Language
5.2 Algorithms
5.3 Proposed Design
5.4 Code

Chapter 6: Testing
6.1 Testing in Machine Learning
6.2 Testing Objectives
6.3 Test Phases
6.4 Test Cases

Chapter 7: Experimental Results
7.1 Snapshot of battery dataset
7.2 Statistics of battery dataset
7.3 Regression results in algorithm
7.4 Bagging Regressor
7.5 Random Forest Regressor
7.6 Snapshot of engine dataset
7.7 Statistics of engine dataset
7.8 Feature Engineering
7.9 Random Forest Classifier
7.10 XGBoost

Chapter 8: Conclusion

Chapter 9: Future Enhancements

References
LIST OF FIGURES

Figure 4.1 System Architecture
Figure 4.2 System Design
Figure 4.3 Activity Diagram
Figure 4.4 Use Case Diagram
Figure 4.5 Sequence Diagram
Figure 4.6 Data Flow Diagram
Figure 7.1 Snapshot of battery dataset
Figure 7.2 Statistics of battery dataset
Figure 7.3 Regression results in algorithm
Figure 7.4 Bagging Regressor
Figure 7.5 Random Forest Regressor
Figure 7.6 Snapshot of engine dataset
Figure 7.7 Statistics of engine dataset
Figure 7.8 Feature Engineering
Figure 7.9 Random Forest Classifier
Figure 7.10 XGBoost

LIST OF TABLES

Table 6.4.1 Test cases for Data preprocessing
Table 6.4.2 Test cases for Model creation
Table 6.4.3 Test cases for Feature Engineering
Table 6.4.4 Test cases for Engine health and Battery RUL prediction
Predictive Maintenance for Automobiles Introduction

CHAPTER 1
INTRODUCTION

1.1 Background

Predictive maintenance for automobiles is a proactive method that employs data-driven
techniques to anticipate and prevent future vehicle component problems. It is a critical
component of the automobile industry, changing how cars are maintained and repaired by
incorporating technologies such as machine learning and data analytics. Because of the
increasing complexity of modern cars and the growing demand for efficient and cost-effective
maintenance techniques, there has been a noticeable shift toward predictive maintenance
strategies.

Predictive maintenance uses data-driven models to forecast when maintenance is required,
allowing preventative measures to be taken before actual breakdowns occur. It has been
applied to a wide range of automotive components, including suspension systems, brakes,
engines, and transmissions. It uses sensor data, machine learning algorithms, and advanced
analytics to monitor the state of automotive systems and predict when maintenance or repairs
are required. By combining historical data with real-time sensor information, it identifies
patterns and anomalies that may indicate future problems such as engine failures, battery
degradation, or other mechanical defects.

Predictive maintenance is crucial in the automobile sector since vehicle emissions and fuel
consumption have a significant detrimental impact on the environment and the economy. In
view of the increased emphasis on sustainability and the need to decrease emission of
greenhouse gases, predictive maintenance can greatly enhance automobile efficiency and
reduce emissions by identifying and resolving any issues before they become critical ones.

The automotive industry relies heavily on predictive maintenance since it lowers operating
costs, streamlines maintenance schedules, and saves time for fleet managers, manufacturers,
and service providers. Cars can avoid unplanned breakdowns and ensure maximum
performance and safety while driving by anticipating maintenance needs and scheduling repairs
or part replacements at the most convenient periods.

BMSIT & M, Dept of CSE 2023-2024 1

1.2 Literature Survey

Paper [1]: Predictive maintenance enabled by machine learning: Use cases and challenges
in the automotive industry.

This study notes that recent developments in maintenance modeling, especially those based
on data-driven approaches such as machine learning (ML), have transformed a number of
applications, most notably in the automotive industry. Throughout a product's life cycle,
predictive maintenance (PdM) is essential for maintaining functional safety and controlling
maintenance costs. ML is a good fit for PdM in contemporary cars because of its capacity to
evaluate enormous volumes of operating data. Nevertheless, surveys that focus specifically
on ML-based PdM in automotive systems are scarce. Noteworthy findings include the impact of
publicly available data on research, the widespread use of supervised techniques that need
labelled data, and the potential of deep learning methods, which depends on efficient
procedures and the availability of sizable labelled datasets.

Paper [2]: Predictive Maintenance in the Automotive Sector

Rapid advances in sensor and network technologies have produced a wealth of condition-
monitoring data, which can be used to estimate the remaining useful life of equipment and
prevent breakdowns through the application of artificial intelligence (AI) and complex
mathematical models. This paper reviews the literature on predictive maintenance in the
automobile sector with an emphasis on AI, statistical inference, and stochastic approaches.
Physical and hybrid models perform well with small amounts of data, while deep learning
methods require larger datasets in order to forecast failures with more precision. Digital twin
technology estimates component lifespans and diagnoses problems to improve vehicle
performance and safety. Open issues include the scarcity of real datasets and the difficulty
of evaluating data that is only partially labelled.

Paper [3]: Machine Learning Models Applied to Predictive Maintenance in Automotive
Engine Components.

The study focuses on predictive maintenance using machine learning approaches such as
Random Forest, Support Vector Machines, Artificial Neural Networks, and Gaussian Processes
to identify problems in automotive engine components. The training data comes from a
simulation testbed that replicates real driving conditions using industry-standard testing
cycles such as NEDC, EUDC, WLTP, and FTP-75. The testbed is used to diagnose problems in
turbocharged petrol engine systems. The purpose of the research is to reduce pollutants, fuel
consumption, and maintenance costs while satisfying the automaker goals of longevity, safety,
and reliability.

Paper [4]: Use of ML Techniques for Li-Ion Battery Remaining Useful Life Prediction-A
Survey

Automobile stability and safety depend heavily on the performance and dependability of
lithium-ion batteries. Recent research has applied machine learning to obtain accurate
estimates of battery remaining useful life (RUL). Data-driven approaches use variables such
as voltage, current, and temperature as important markers of battery health. Specifically,
algorithms such as Decision Tree (DT) and Gradient Boosting (GB) outperform other methods in
RUL prediction on a wide range of performance criteria. These methods provide robustness,
accuracy, and generality when assessing battery health conditions, and testing confirms
their effectiveness in predicting battery RUL.

Paper [5]: Novel Method Based on Stacking Model for Remaining Useful Life Prediction
of Lithium-ion Batteries.

The study emphasises how important lithium-ion batteries are to cars, because they are the
main energy storage component in a variety of devices. These batteries deteriorate with
repeated charging, which affects dependability and performance over time. For Battery
Management Systems (BMS), it becomes essential to estimate Remaining Useful Life (RUL).
With accurate RUL prediction, a BMS can predict faults, optimise maintenance, and extend
battery life. In order to improve battery health and resilience under demanding operating
conditions, the study explores a variety of methodologies and Machine Learning (ML)
techniques for RUL prediction.

Paper [6]: Machine Learning Models Applied to Predictive Maintenance in Automotive
Engine Components.

Lithium-ion batteries are essential for energy storage in a variety of industries. For
sustainable operation, it is essential to precisely predict their Remaining Useful Life (RUL).
To predict RUL, the paper proposes a stacking model that incorporates several health
indicators derived from battery performance data. For reliable predictions, this method
combines multiple linear regression, random forests, gradient boosting decision trees, and
support vector regression. Testing on the Hawaii Natural Energy Institute (HNEI) dataset
demonstrates the higher performance of the stacking model, with an RUL prediction error of
11.3, highlighting its accuracy and efficacy in estimating battery lifespan.

Paper [7]: Data-driven strategies for predictive maintenance: Lesson learned from
automotive use case.

PREPIPE, a data-driven predictive maintenance pipeline designed specifically for the
automobile industry, predicts oxygen sensor blockage in diesel engines. It provides insights
into feature relevance, data requirements, and signal selection by leveraging on-board sensors
and cloud connectivity. Thorough testing confirms its capacity to accurately forecast sensor
conditions ahead of time. It also has the potential to help domain professionals improve
data-driven maintenance strategies by offering interpretability while achieving performance
comparable to that of deep learning techniques.

Paper [8]: Prediction of Battery Remaining Useful Life Based on Multi-dimensional
Features and Machine Learning.

Although lithium-ion batteries are known for having a high energy density, their broad use is
hampered by dependability issues. To investigate this, the study analyses the first 100
cycles of multistage rapid charging data from the MIT-Stanford datasets. Correlation analysis
identifies four characteristics that are closely related to battery life. These features are
fed into a variety of machine learning models, with variations in training data and model
type. The top-performing model attains a prediction error of 9.34% and an average error of
92.56 cycles, indicating the effectiveness of this approach in predicting battery lifespan.

1.3 Motivation
The goal of this project is to address an acknowledged need in the automotive maintenance
sector for improved diagnostic and support systems, particularly in the areas of engine and
battery predictive maintenance. The automotive sector strives to maintain peak performance,
which makes it important to identify problems early and take proper action to avoid
costly breakdowns. Outdated reactive maintenance approaches are costly and reduce
vehicle performance, and the imprecision of current diagnostic techniques causes delays in
both maintenance and repair.

As a result, the goal is to apply machine learning and data analysis approaches to
create accurate predictive models of engine health and battery remaining useful life (RUL).
Predictive maintenance uses machine learning and data analytics to identify possible faults
before they become big problems, lowering the likelihood of costly repairs and downtime. The
goal is to provide trustworthy diagnostic tools to auto mechanics so that they can spot
problems early and take appropriate action. This technique aims to reduce emissions and the
environmental impact of vehicle use by optimizing vehicle performance and maintenance
schedules, while also enhancing overall vehicle performance and economy.

The ultimate goal is to reduce operational costs while also improving vehicle dependability.
One of the project's objectives is to develop machine learning models that can accurately
predict when maintenance is required. These models will examine a variety of data sources,
including sensors, telematics systems, and maintenance records. The project's purpose is to
develop predictive maintenance procedures for various automotive parts, including brakes,
engines, and transmissions. The research is expected to produce significant benefits for the
automobile industry, such as greater driving enjoyment, overall sustainability, and efficiency in
vehicle maintenance and repair.

1.4 Problem Statement

In the automotive industry, keeping cars in continuous operation is essential to consumer
satisfaction and safety. Unexpected failures and maintenance issues can be frustrating for
drivers, result in costly repairs, and damage the reputations of automakers. Conventional
maintenance practices, such as scheduled inspections or routine servicing, are not always
cost-effective or efficient. Moreover, these methods often result in either over-maintenance
(replacing parts that are still functional) or under-maintenance (leading to unexpected
breakdowns). The problem is to use machine learning techniques to create an advanced
predictive maintenance system. This system needs to accurately predict when a vehicle system
or component is likely to fail so that automakers and service providers can prevent failures
from occurring.

1.5 Aim and Objective

• Data Collection and Preprocessing: Collect and clean data from several sources and
sensors, addressing anomalies and missing values. Identify and engineer relevant features
that can reveal information about the condition of car parts.

• Algorithm Selection and Training: Using the prepared data, select suitable machine
learning algorithms (such as regression and classification) and train them. Apply machine
learning or statistical methods to find anomalies in data streams.

• Model Evaluation: To make sure the models are effective, assess them using measures such
as accuracy, precision, recall, and F1-score. To help users prepare for maintenance or
replacement, the model should be able to estimate how long a component will remain useful
and to forecast the likelihood of a failure.
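
The evaluation measures named in the last objective can be computed directly from the counts of correct and incorrect predictions. The following is a minimal sketch, not the project's own evaluation code; the function name `classification_metrics`, the label convention (1 for the positive class) and the sample labels are assumptions made only for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one prediction is required")

    # Confusion-matrix counts with respect to the chosen positive class
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels, used only to exercise the function
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision and recall are defined as 0.0 here when their denominators are zero; that convention is a design choice of this sketch, and other conventions exist.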

1.6 Scope
The project's scope includes developing predictive maintenance solutions for battery
remaining useful life (RUL) estimation and engine condition prediction using machine learning
techniques and sensor data analysis. By concentrating on proactive maintenance strategies,
the project intends to optimize resource utilization, decrease downtime, and lower
maintenance costs across many industrial sectors, including manufacturing, transportation,
energy, and aerospace. It also intends to improve overall asset management procedures and
operational efficiency through data-driven decision-making.

1.7 Challenges
• In order to improve the precision and effectiveness of predictive maintenance solutions and
facilitate the early detection of possible problems, it is imperative to integrate state-of-the-art
sensors, intelligent machinery, and sophisticated business analytics tools.

• The various components of a predictive maintenance solution must communicate smoothly
with one another in order to guarantee real-time data sharing and efficient coordination,
which enables prompt decision-making and action.

• Even though predictive maintenance solutions have many long-term advantages, many
organizations face high upfront costs that necessitate careful consideration of investment
strategies and ROI analysis.


CHAPTER 2
OVERVIEW
Predictive maintenance is emphasized heavily in this project, especially in the area of engine
condition prediction, where supervised machine learning algorithms are used to generate
precise predictions on the condition of automobile engines. By carefully examining vital engine
characteristics such as RPM, oil pressure, temperature, and coolant pressure, the prediction
algorithm may foresee possible defects or indications of a decline in performance, allowing for
the implementation of preventive maintenance.

Preventative maintenance is essential for a number of reasons. By addressing potential flaws
before they become significant problems, it helps minimize costly repairs and downtime while
keeping the engine running smoothly and efficiently. This method also prolongs the life of the
vehicle while minimizing emissions and environmental effect, increasing safety, and
optimizing vehicle performance. It also reduces early wear and tear, lowers maintenance costs,
and improves overall vehicle reliability.

This study also explores the estimation of battery remaining useful life (RUL), a crucial
part of maintaining electric and hybrid vehicles. The model can forecast how long a battery
will last by continually monitoring important battery health parameters such as voltage and
current. This allows timely interventions such as replacement or maintenance. In general, the
project seeks to use the advantages of predictive maintenance to lower the total cost of
automobile ownership, improve vehicle reliability, encourage environmentally friendly modes
of transportation, and ensure effective use of battery resources.
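
To make the idea of supervised engine condition prediction concrete, the sketch below trains a very simple classifier on labelled sensor readings and then labels a new reading. It is only an illustration under stated assumptions: it uses a nearest-centroid rule as a stand-in rather than the models used in this project, the feature order (RPM, oil pressure, coolant temperature), the labels "good" and "bad", the function names and all sample values are hypothetical.

```python
from statistics import mean, pstdev


def fit_nearest_centroid(rows, labels):
    """Standardize each feature, then store one mean vector per class."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant features to avoid division by zero
    stds = [pstdev(col) or 1.0 for col in columns]

    def scale(row):
        return [(v - m) / s for v, m, s in zip(row, means, stds)]

    centroids = {}
    for label in set(labels):
        members = [scale(r) for r, y in zip(rows, labels) if y == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return scale, centroids


def predict(model, row):
    """Return the class whose centroid is closest to the scaled reading."""
    scale, centroids = model
    z = scale(row)

    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(z, centroids[label]))

    return min(centroids, key=squared_distance)


# Hypothetical readings: (rpm, oil_pressure, coolant_temperature)
readings = [
    (800, 3.5, 80), (1200, 4.0, 82), (1500, 4.2, 85), (1000, 3.8, 79),
    (700, 1.2, 105), (1800, 1.5, 110), (900, 1.0, 98), (1600, 1.8, 102),
]
conditions = ["good"] * 4 + ["bad"] * 4

model = fit_nearest_centroid(readings, conditions)
print(predict(model, (1100, 3.9, 81)))
print(predict(model, (1000, 1.3, 104)))
```

Features are standardized before distances are computed because RPM values are several orders of magnitude larger than oil pressure values and would otherwise dominate the comparison.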


CHAPTER 3
REQUIREMENT SPECIFICATION

3.1 Mapping of Requirements

The process of matching project objectives, goals, and specifications with particular features,
functionalities, and deliverables is known as requirements mapping. It entails determining and
recording how each need advances the goals and scope of the project as a whole. Throughout
the project lifecycle, this mapping guarantees that all project requirements are precisely stated,
comprehended, and applied. It also makes it easier for developers, project teams, and
stakeholders to communicate effectively, which promotes successful project outcomes and
stakeholder satisfaction.

User Requirement: A user-friendly interface is required for interacting with the predictive
maintenance system.

Domain Requirement: Data on engine health and battery performance must be accessible
from the relevant databases.

Hardware Requirement: Enough RAM to perform data processing operations effectively, and
compatibility with contemporary CPUs.

Software Requirement: Python and the libraries required for machine learning operations,
including NumPy, Pandas, and Scikit-learn, must be installed.

System Requirement: The predictive maintenance system's many components must establish
seamless communication with one another.

Cost Requirement: The initial expenses of putting the predictive maintenance system into
practice must be taken into account.

Performance Requirement: The system must ensure effective coordination and real-time data
exchange between its components.

Scalability Requirement: The system must be able to grow to accommodate changing
business requirements and growing data volumes.

Precision Requirement: The system must reach a high degree of precision in determining the
health of engines and estimating the remaining useful life of batteries.

Reliability Requirement: The predictive maintenance system must operate consistently to
prevent interruptions to vital processes.

3.2 Functional Requirements

• Data Collection and Integration: The system should be able to gather information from the
cars' numerous sensors and sources. It should facilitate the integration of various kinds of
data, such as environmental conditions, previous maintenance records, and real-time data.

• Preprocessing of Data: The system should be able to clean and prepare raw data, including
handling outliers and missing values and normalizing data.

• Feature Engineering: The system should provide techniques for feature extraction and
selection that help find relevant features in the data for predictive modeling.

• Machine Learning Algorithms: The system should use machine learning models, such as
regression, classification, and anomaly detection algorithms, for predictive maintenance.
Deep learning models should be supported for more intricate data analysis.

• Validation and Training of Models: The system should be able to train machine learning
models on historical data and validate their performance using relevant evaluation measures.
Hyperparameter tuning and cross-validation should be used to maximize model accuracy.
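
The preprocessing requirement above (missing values, outliers, normalization) can be sketched for a single sensor column as follows. This is a minimal illustration rather than the project's pipeline: median imputation, clipping at 1.5 times the interquartile range and min-max scaling are assumed choices, the function names and sample values are hypothetical, and `statistics.quantiles` requires Python 3.8 or later, which is newer than the version listed in Section 3.6.1.

```python
from statistics import median, quantiles


def impute_missing(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column carries no information; map it to zeros
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical sensor column with one missing value and one extreme reading
raw = [10, 11, None, 13, 14, 15, 16, 1000]
filled = impute_missing(raw)
clipped = clip_outliers(filled)
normalized = min_max_normalize(clipped)
print(normalized)
```

The steps are applied in this order so that the extreme reading is limited before scaling; otherwise it would compress all other values toward zero.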

3.3 Non-Functional Requirements

Reliability: The technology minimizes false alarms and ensures reliable maintenance
recommendations by reliably predicting engine conditions and battery RUL.

Performance: In order to facilitate preventive maintenance activities and reduce downtime,
algorithms are tailored to handle sensor data quickly and efficiently. This results in timely
predictions.

Portability: The system can be easily adopted and used on multiple platforms and systems
because it is built to be integrated with current infrastructure and implemented in a variety of
industrial settings.

Scalability: As the size of the dataset and user base increase, the system's ability to manage
growing volumes of sensor data and meet rising computational needs will guarantee its
continued effectiveness.

Flexibility: Users can modify parameters and algorithms to meet particular maintenance
demands and operating requirements thanks to the solution's adjustable configurations and
flexible models.

Security: Throughout the predictive maintenance process, strong security measures are put in
place to safeguard private information and guarantee its availability, confidentiality, and
integrity.

3.4 User Requirements

Data management: Uploading datasets for analysis ought to be possible for users. Commonly
used data formats like CSV, Excel, and JSON should be supported by the system.

Model Training: It should be possible for users to select and train various machine learning
models. Model evaluation and hyperparameter tuning should be possible with the system.

Prediction and Visualisation: Using trained models, users ought to be able to make
predictions. The system ought to produce visual aids that aid users in comprehending model
performance and data-driven insights.

Interactivity and Usability: The user interface should be simple to use and intuitive, making it
appropriate for users with different degrees of technical proficiency. It should be possible for
users to modify and engage with visualisations.
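
As an illustration of the data management requirement, the sketch below reads an uploaded dataset into a list of records based on its file extension. It is an assumption-laden example, not the system's actual upload code: the function names `parse_dataset` and `load_dataset` are hypothetical, only CSV and JSON are handled because the Python standard library has no Excel reader, and Excel support would need a third-party library.

```python
import csv
import io
import json
from pathlib import Path


def parse_dataset(text, fmt):
    """Parse dataset text in the given format into a list of dict records."""
    fmt = fmt.lower().lstrip(".")
    if fmt == "csv":
        # The first row is treated as the header; values stay as strings
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        data = json.loads(text)
        if not isinstance(data, list):
            raise ValueError("expected a JSON array of records")
        return data
    raise ValueError(f"unsupported dataset format: {fmt!r}")


def load_dataset(path):
    """Read a dataset file and dispatch on its extension."""
    file_path = Path(path)
    return parse_dataset(file_path.read_text(encoding="utf-8"), file_path.suffix)


# Hypothetical inline samples standing in for uploaded files
csv_records = parse_dataset("rpm,oil_pressure\n700,2.5\n", "csv")
json_records = parse_dataset('[{"rpm": 700, "oil_pressure": 2.5}]', ".json")
print(csv_records, json_records)
```

Note that the CSV reader returns every value as a string, so numeric columns would still need conversion before model training.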

3.5 Domain Requirements

Access to engine health data: Availability of extensive datasets containing engine health
metrics such as temperature, fuel pressure, oil pressure, and RPM.

Access to battery performance data: Availability of datasets containing battery performance
measures such as cycle index, discharge time, voltage levels, and charging duration.

Understanding of automobile systems: Knowledge of automotive systems and parts in order to
analyze sensor data correctly and spot problems.

Knowledge of predictive maintenance techniques: Knowledge of methods and strategies for
predictive maintenance in order to build effective predictive models.

Data Preprocessing Expertise: The ability to prepare datasets for analysis using methods
including cleaning, normalization, and feature extraction.


Machine Learning Skills: Proficiency in machine learning algorithms and methodologies for
model training, evaluation, and prediction.

Understanding of Performance Metrics: Understanding of the measures used to assess
predictive models, including accuracy, precision, recall, and F1-score.

Knowledge of Industry Standards: Knowledge of industry rules and guidelines concerning
the maintenance and safety requirements of automobiles.

3.6 System Requirements

3.6.1 Software Requirements

• Operating system: Windows 7 or above
• Coding Language: Python, HTML
• Version: Python 3.6.8
• Online Environment: Google Colab
• Python Libraries such as:
   o NumPy
   o Pandas
   o Matplotlib
   o Seaborn
   o Scikit-learn
   o XGBoost

3.6.2 Hardware Requirements

• System: Pentium IV 2.4 GHz
• Hard Disk: 500 GB
• RAM: 4 GB
• Any desktop/laptop system with the above configuration or higher.


CHAPTER 4
DETAILED DESIGN

4.1 System Architecture

Fig 4.1: System Architecture

Fig 4.1 depicts the system architecture for predictive maintenance of engines and batteries.
Data preprocessing comes first and includes cleaning, handling of missing values, and
normalization of the raw datasets containing engine and battery sensor data. To improve
prediction accuracy, feature engineering then selects the pertinent data and builds new
features. Model selection identifies appropriate machine learning techniques for predicting
engine condition and estimating battery remaining useful life. The selected models are trained
on the preprocessed data, and their parameters are tuned to maximize performance. Finally,
model evaluation measures prediction accuracy with suitable metrics to ensure that the
predictive maintenance system for engine and battery components is dependable and efficient.


4.2 System Design

Fig 4.2: System Design

Fig 4.2 represents the workflow model, which contains the following modules.

Data Collection
Identify relevant data sources, such as engine sensor data (e.g., temperature, pressure,
RPM), maintenance logs or records, historical performance data, and external datasets on
engine status or battery conditions.

Data Preprocessing

Clean the data by fixing issues such as missing values. Handle outliers: identify, then
remove or correct any outliers that might harm model performance. Normalize or scale the
data: bring numerical features to a similar range so that features with larger magnitudes do
not dominate the model. Perform data transformations and encoding as needed, converting
categorical variables to numerical representations using one-hot or label encoding. To
assess model performance, divide the data into training and testing sets using a train-test
split or cross-validation.
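
The preprocessing steps above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the toy DataFrame, its column names (rpm, oil_pressure, fuel_type, condition), the median fill and the 1.5 x IQR clipping rule are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; the column names are illustrative only.
raw = pd.DataFrame({
    "rpm": [700.0, 820.0, np.nan, 760.0, 5000.0, 790.0, 810.0, 745.0],
    "oil_pressure": [3.1, 3.4, 3.0, np.nan, 3.3, 3.2, 3.5, 3.1],
    "fuel_type": ["petrol", "diesel", "petrol", "petrol",
                  "diesel", "petrol", "diesel", "petrol"],
    "condition": [1, 0, 1, 1, 0, 1, 0, 1],
})
numeric_cols = ["rpm", "oil_pressure"]

# 1. Missing values: fill numeric gaps with the column median.
clean = raw.copy()
clean[numeric_cols] = clean[numeric_cols].fillna(clean[numeric_cols].median())

# 2. Outliers: clip each numeric column to [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1 = clean[numeric_cols].quantile(0.25)
q3 = clean[numeric_cols].quantile(0.75)
iqr = q3 - q1
clean[numeric_cols] = clean[numeric_cols].clip(
    lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr, axis=1)

# 3. Encoding: one-hot encode the categorical column.
encoded = pd.get_dummies(clean, columns=["fuel_type"], dtype=float)
X = encoded.drop(columns="condition")
y = encoded["condition"]

# 4. Split first, then fit the scaler on the training part only,
#    so that no information from the test set leaks into training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler after the split is a design choice made here so that the test set stays unseen; it is not something the report prescribes.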


Feature Engineering
Examine the data to identify any potentially relevant characteristics or modifications. Create
new features based on domain expertise or insights gathered via data study. To extract crucial
information from raw data, generate rolling averages or aggregate data over time intervals.
Select the most relevant characteristics for the target variable and remove those that are
redundant or unnecessary.
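
As a small illustration of the rolling-average idea, the sketch below derives two features from a hypothetical per-cycle battery reading. The column names, the values, and the window size of 3 are assumptions made for the example, not values taken from the project dataset.

```python
import pandas as pd

# Hypothetical per-cycle readings; names and values are illustrative only.
readings = pd.DataFrame({
    "cycle": range(1, 9),
    "discharge_time": [100.0, 98.0, 97.0, 95.0, 94.0, 90.0, 89.0, 85.0],
})

window = 3  # assumed window length

# Rolling mean over the last `window` cycles smooths out sensor noise.
readings["discharge_time_roll_mean"] = (
    readings["discharge_time"].rolling(window=window, min_periods=1).mean()
)

# Cycle-to-cycle change captures the degradation trend.
readings["discharge_time_delta"] = readings["discharge_time"].diff().fillna(0.0)
```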

Model Selection
The engine and battery predictive maintenance project involves two kinds of task, so a
variety of machine learning algorithms must be explored: classification (engine condition
prediction) and regression (battery remaining useful life prediction). Random Forest,
Logistic Regression, and Gradient Boosting are appropriate for engine condition prediction
because they handle classification problems well. Random Forest Regression, XGBoost
Regression, and Decision Tree Regression are suitable algorithms for forecasting the
remaining useful life of batteries.

Model Training and Evaluation

The models are trained on the training set and then assessed on the testing set, using
evaluation criteria suited to the nature of each task. Accuracy, precision, recall, and
F1-score are used to evaluate classification performance. For regression tasks, measures
such as mean squared error, mean absolute error, and R-squared are used.
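
The metrics named above are all available in Scikit-learn. The sketch below computes them on small hand-made label and value lists, which are hypothetical and serve only to show the calls.

```python
from sklearn.metrics import (accuracy_score, f1_score, mean_absolute_error,
                             mean_squared_error, precision_score, r2_score,
                             recall_score)

# Hypothetical classification labels (1 = maintenance needed).
y_true_cls = [1, 0, 1, 1, 0, 1]
y_pred_cls = [1, 0, 0, 1, 1, 1]

accuracy = accuracy_score(y_true_cls, y_pred_cls)
precision = precision_score(y_true_cls, y_pred_cls)
recall = recall_score(y_true_cls, y_pred_cls)
f1 = f1_score(y_true_cls, y_pred_cls)

# Hypothetical regression targets (remaining useful life in cycles).
y_true_reg = [100.0, 80.0, 60.0, 40.0]
y_pred_reg = [90.0, 85.0, 60.0, 50.0]

mse = mean_squared_error(y_true_reg, y_pred_reg)
mae = mean_absolute_error(y_true_reg, y_pred_reg)
r2 = r2_score(y_true_reg, y_pred_reg)
```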


4.3 Activity Diagram

Fig 4.3 Activity Diagram

Fig 4.3 displays the activity diagram and contains the following modules.

Collection of engine and battery data: The initial stage of the procedure involves gathering
data from the engine and battery. This data includes sensor readings such as voltage,
current, engine RPM, and fuel pressure.

Data preprocessing: To get the data ready for the machine learning model, it is first
preprocessed. This entails scaling, eliminating outliers, and cleaning the data.

Splitting the dataset into training and test data: Next, two sets of the preprocessed data are
created: a test set and a training set. The machine learning model is trained on the training set,
and its performance is assessed on the test set.


Training the model using training data: Subsequently the training data is used to train the
machine learning model. The RUL of a battery can be predicted by the model by recognising
patterns in the data.

Making prediction on test data: After the model has been trained, test data predictions are
made using it. Every battery in the test set has its RUL predicted by the model.

Engine health condition: The model predicts whether or not the engine needs maintenance, and
the maintenance decision is based on that prediction.

Battery RUL Prediction: The model determines from its RUL prediction whether the battery
needs to be replaced.

4.4 Use Case Diagram

Fig 4.4: Use Case diagram

Fig 4.4 represents the use case diagram, which depicts the interaction between the system
administrator and the system's functionalities. The administrator starts the process by
loading the dataset, which includes the relevant sensor data for the battery and engine
components. The administrator then applies classification and regression algorithms to the
dataset so that the system can forecast two primary conditions: the health of the engine and
the remaining useful life of the battery. After a prediction is made, the administrator
monitors a number of metrics to assess the predictive maintenance system's performance and
accuracy. These metrics help direct decisions about maintenance actions and offer insight
into how effective the algorithms are.

4.5 Sequence Diagram

Fig 4.5: Sequence diagram

Fig 4.5 displays the sequence diagram. The user begins the process by collecting data
attributes related to the engine and battery parameters of automobiles; together these
attributes make up the dataset. The collected data is then used to extract knowledge and
perform classification. During classification, the records in the dataset are labelled using
established criteria or techniques, and the classification results are generated from the
dataset's attributes.

Performance is evaluated once classification and result generation are complete. The
classification process is evaluated using measures such as accuracy, precision, recall, and
other relevant metrics. The user is then shown the evaluated performance measures, which
provide information on the efficiency and reliability of the classification method.

4.6 Data Flow Diagram

Fig 4.6.1: DFD Level 0

Fig 4.6.1 (DFD Level 0) gives a high-level overview of the system architecture and data flow.
It depicts how the machine learning model receives input parameters from the engine dataset,
such as temperature measurements, engine RPM, fuel pressure, coolant pressure, and
lubricating oil pressure. These parameters are the features the model uses to make
predictions.

After the data has been fed into the model, the machine learning algorithms process it and
produce output predictions. The diagram also shows how the system determines how accurate
the predictions are. This evaluation step is critical for establishing the predictive
maintenance system's dependability and performance.


Fig 4.6.2: DFD Level 1

Fig 4.6.2 (DFD Level 1) shows the detailed process flow for the engine dataset inside the
predictive maintenance system. First, the raw input data from the engine dataset is
preprocessed. The preprocessed data is then passed to the feature engineering phase, where
new features are derived. These features are intended to expose significant patterns and
correlations in the data.

The dataset is then split into two parts, training and testing. The training dataset is fed
into the chosen machine learning algorithms to train the model. This procedure fits the
algorithms to the data in order to uncover patterns and relationships that allow precise
predictions. The testing dataset is then used for model assessment. The model's performance
is evaluated using metrics such as accuracy, precision, recall, and mean squared error.


CHAPTER 5

IMPLEMENTATION

5.1 Programming Language

The programming language used in the project is Python. It is the project's core programming
language and offers a strong basis for data analysis, preprocessing, and machine learning.
Python is known for its simplicity, readability, and straightforward syntax, which make code
development and maintenance more efficient. Its adaptability allows a variety of algorithms
and methodologies to be implemented, making it suitable for a wide range of data science and
machine learning applications.

Python's dynamic typing and automatic memory management make development more efficient,
letting programmers concentrate on problem solving instead of low-level details.
Furthermore, Python's large standard library includes a wide range of built-in functions and
modules for data manipulation, file management, and other tasks. All code snippets in this
report are written in Python. Python is a high-level, interpreted language that supports
several programming paradigms, including procedural, object-oriented, and functional
programming. It is popular in industry, academia, and research due to its ease of use and
widespread community support.

Python offers a wide range of frameworks and libraries for applications including data
analysis, machine learning, web development, and scientific computing. The project makes use
of several libraries and library components, including:

NumPy (np): NumPy is the foundational package for scientific computing in Python. It
supports large multi-dimensional arrays and matrices and provides mathematical functions for
manipulating these arrays efficiently.

Pandas (pd): Pandas is an effective Python data analysis library. It provides data structures that
make data management, exploration, and cleaning simple, such as DataFrame and Series.


Scikit-learn (sklearn): Scikit-learn is a popular Python machine learning library. It offers
many methods for both supervised and unsupervised learning, such as classification,
regression, clustering, and dimensionality reduction.

Seaborn (sns): Seaborn is a Matplotlib-based library for data visualization. It offers a
high-level interface for drawing attractive and informative statistical graphics.

TensorFlow (tf): Google developed this open-source machine learning framework. In addition
to support for distributed computation and deployment across several platforms, it offers tools
for creating and refining deep learning models.

Matplotlib (plt): Matplotlib is a feature-rich Python visualization toolkit for static, animated,
and interactive graphics. Numerous other types of plots, charts, and graphs can be produced
using it.

StandardScaler: StandardScaler is a preprocessing class in Scikit-learn. It standardizes
features by removing the mean and scaling to unit variance.

train_test_split: Scikit-learn's train_test_split function divides a dataset into training
and testing subsets. It is frequently used to assess model performance in machine learning.

Pipeline: Scikit-learn's Pipeline class chains several transformers and a final estimator
into a single object, which is useful for processing data and building models in sequence.

ColumnTransformer: This Scikit-learn class applies different preprocessing steps to
individual columns or subsets of columns within a dataset.

SimpleImputer: Scikit-learn's SimpleImputer class imputes missing values in a dataset. It
replaces missing values using a chosen strategy, such as the mean, median, or most frequent
value.
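
To show how these Scikit-learn components fit together, the sketch below chains SimpleImputer, StandardScaler, ColumnTransformer and Pipeline around a classifier. The toy data, the column names and the choice of LogisticRegression as the final estimator are assumptions made for illustration; the project's own pipeline may be arranged differently.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical engine readings; column names are illustrative only.
data = pd.DataFrame({
    "rpm": [700.0, 820.0, np.nan, 760.0, 900.0, 790.0],
    "coolant_temp": [78.0, np.nan, 81.0, 79.5, 90.0, 80.0],
    "fuel_type": ["petrol", "diesel", "petrol", "petrol", "diesel", "petrol"],
})
labels = [1, 0, 1, 1, 0, 1]

# Numeric columns: impute missing values with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Apply different preprocessing to numeric and categorical columns.
preprocess = ColumnTransformer([
    ("num", numeric_steps, ["rpm", "coolant_temp"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["fuel_type"]),
])

# Preprocessing and the estimator are fitted together as one object.
model = Pipeline([
    ("preprocess", preprocess),
    ("classifier", LogisticRegression()),
])
model.fit(data, labels)
predictions = model.predict(data)
transformed_shape = model.named_steps["preprocess"].transform(data).shape
```

Wrapping the preprocessing in the same Pipeline as the estimator means the imputation and scaling statistics are learned from the training data only whenever the pipeline is fitted.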

5.2 ALGORITHMS
The following are the algorithms used to predict the Engine condition and Battery RUL

Random Forest Algorithm: Random Forest is a flexible and effective ensemble learning method
for both classification and regression. It builds numerous decision trees during training
and combines their outputs, taking the majority vote of the trees for classification and the
average of their predictions for regression, which lowers the risk of overfitting and
increases accuracy. Random Forest is known for its robustness, especially on complicated
datasets with high dimensionality and heterogeneous data types. It suits a variety of
applications because it can handle both numerical and categorical features without a great
deal of preprocessing.

The capacity of Random Forest to capture intricate nonlinear correlations and interactions
between characteristics is one of its main advantages. This makes it an excellent choice for
situations in which the underlying data distribution is difficult to separate linearly. Furthermore,
Random Forest gives an inherent measure of feature relevance, which enables users to learn
which features have the most influence on predictions.

Pseudocode

Precondition: A training set engine_dataset = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)},
the set of engine features Engine_features, and the number of trees in the forest B.

function Random_Forest(engine_dataset, Engine_features, B)
    H ← empty set of trees
    for i ∈ 1, ..., B do
        S_i ← bootstrap sample from engine_dataset
        h_i ← Randomized_Tree_Learn(S_i, Engine_features)
        H ← H ∪ {h_i}
    end for
    return H
end function

function Randomized_Tree_Learn(S, F)
    at each node:
        f ← random subset of the features in F
        split S on the best feature in f with respect to engine health condition
    return the learned tree
end function
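
The pseudocode can be turned into a short Python sketch. This is an illustration under stated assumptions, not the project's implementation (which uses Scikit-learn's RandomForestClassifier): the function names random_forest and forest_predict are hypothetical, the data is synthetic, the labels are assumed to be binary 0/1, and each randomized tree is a DecisionTreeClassifier with max_features="sqrt" so that a random feature subset is considered at every split.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def random_forest(X, y, n_trees, seed=0):
    """Train n_trees randomized trees, each on its own bootstrap sample."""
    rng = np.random.default_rng(seed)
    n_samples = len(X)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw n_samples row indices with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        # max_features="sqrt" makes each split consider a random feature subset.
        tree = DecisionTreeClassifier(
            max_features="sqrt",
            random_state=int(rng.integers(0, 2**31 - 1)),
        )
        tree.fit(X[idx], y[idx])
        forest.append(tree)
    return forest


def forest_predict(forest, X):
    """Majority vote over the trees (binary 0/1 labels assumed)."""
    votes = np.stack([tree.predict(X) for tree in forest])
    return (votes.mean(axis=0) >= 0.5).astype(int)


# Synthetic stand-in for the engine dataset.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

forest = random_forest(X, y, n_trees=25)
pred = forest_predict(forest, X)
train_accuracy = float((pred == y).mean())
```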

Decision Tree Algorithm: The Decision Tree algorithm is a flexible and easy-to-understand
machine learning technique suited to both regression and classification problems. It
recursively partitions the data into subsets based on the values of input features, creating
a tree-like structure in which internal nodes represent tests on features and leaf nodes
represent class labels or regression values.

Decision trees are especially helpful for comprehending the underlying reasoning behind a
model's predictions because of their high interpretability. The decision path leading to a
specific outcome can be readily traced by following the branches of the tree, which makes it
useful for both explanatory and predictive purposes.

One of the main advantages of decision trees is their capacity to handle both numerical and
categorical data without a lot of preprocessing. They can also handle missing values through
methods such as imputation or surrogate splits, and they are resistant to outliers. Decision
trees can capture complex nonlinear relationships in the data, which makes them a simple
tool for modelling complex decision boundaries. Equ (1) represents the Gini index, an
impurity measure commonly used to choose splits.
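\[
\text{Gini Index} = 1 - \sum_{i=1}^{n} P_i^2 = 1 - \left(P_+^2 + P_-^2\right) \tag{1}
\]
where,
P_i is the probability of class i among the n classes,
P_+ is the probability of the positive class and
P_- is the probability of the negative class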
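
As a worked example of the Gini index, the sketch below computes it for a list of class labels and for a candidate split. The helper names gini_index and weighted_gini are hypothetical and introduced only for illustration.

```python
from collections import Counter


def gini_index(labels):
    """Gini index of a node: 1 minus the sum of squared class probabilities."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def weighted_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_index(left) + (len(right) / n) * gini_index(right)


mixed = gini_index([1, 1, 0, 0])           # evenly mixed node
pure = gini_index([1, 1, 1])               # pure node
split = weighted_gini([1, 1, 1], [0, 0, 1])
```

A pure node has a Gini index of 0 and an evenly mixed binary node has 0.5, so a split is preferred when it lowers the weighted value.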

Pseudocode
GenDecTree(Engine_Data S, Engine_Features F) Steps:
1. If stopping_condition(S, F) = true then
   a. leaf = createNode()
   b. leaf.label = classify(S)
   c. return leaf
2. root = createNode()
3. root.test_condition = findBestSplit(S, F)
4. V = {v | v is a possible outcome of root.test_condition}
5. For each value v in V:
   a. S_v = {s | root.test_condition(s) = v and s belongs to S}
   b. child = GenDecTree(S_v, F)
   c. Add child as descendant of root and label the edge {root → child} as v
6. Return root

XGBoost: XGBoost (Extreme Gradient Boosting) is an optimized implementation of gradient
boosting, a machine learning technique that builds a sequence of decision trees to generate
predictions. XGBoost is highly regarded for its efficiency, scalability, and strong
performance in predictive modelling tasks.

XGBoost's performance and speed optimisation is one of its primary features. To improve
computational efficiency and reduce overfitting, it makes use of a number of strategies,
including regularisation, tree pruning, and parallel processing. Because of this, XGBoost works
especially well with large-scale, highly dimensional datasets.

XGBoost works by iteratively adding new decision trees to the model, each one fitted to the
residuals or errors of the previous trees. At each iteration it minimizes a specified loss
function by following its gradient, effectively learning from the mistakes of the previous
trees and improving predictive accuracy. Equ (2) represents the XGBoost objective function
at iteration t.

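\[
L^{(t)} = \sum_{i=1}^{n} l\left(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) \tag{2}
\]
Here \(y_i\) is the true value, \(\hat{y}_i^{(t-1)}\) is the prediction after t-1 trees,
\(f_t\) is the tree added at iteration t, and \(\Omega(f_t)\) is its regularization term.
The loss term can be seen as \(f(x + \Delta x)\),
where,
\(x = \hat{y}_i^{(t-1)}\) and \(\Delta x = f_t(x_i)\)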
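
The additive idea behind Equ (2) can be illustrated with a simplified gradient boosting loop. This sketch is not XGBoost itself: it assumes squared-error loss, for which the negative gradient is simply the residual, and it omits the regularization term and the second-order approximation that XGBoost uses. The function names, the learning rate and the synthetic data are assumptions for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_boosted_trees(X, y, n_rounds=50, learning_rate=0.1, max_depth=2):
    """Simplified gradient boosting for squared-error loss."""
    base = float(np.mean(y))               # initial constant prediction
    prediction = np.full(len(y), base)
    trees = []
    for _ in range(n_rounds):
        residual = y - prediction          # negative gradient of 0.5*(y - pred)^2
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residual)              # new tree f_t models the current errors
        prediction = prediction + learning_rate * tree.predict(X)
        trees.append(tree)
    return base, trees


def predict_boosted_trees(base, trees, X, learning_rate=0.1):
    """Sum the base prediction and the scaled outputs of all trees."""
    prediction = np.full(X.shape[0], base)
    for tree in trees:
        prediction = prediction + learning_rate * tree.predict(X)
    return prediction


# Synthetic one-dimensional regression problem.
X = np.linspace(0.0, 6.0, 80).reshape(-1, 1)
y = np.sin(X[:, 0])

base, trees = fit_boosted_trees(X, y)
mse_base = float(np.mean((y - base) ** 2))
mse_boosted = float(np.mean((y - predict_boosted_trees(base, trees, X)) ** 2))
```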

Bagging Regressor: The Bootstrap Aggregating (Bagging) Regressor is a meta-estimator that
fits base regressors on random subsets of the original dataset in order to improve on their
individual performance. It is based on the ensemble learning principle of combining the
predictions of several independently trained models to produce the final prediction.

The Bagging Regressor operates by bootstrapping—a technique in which samples are selected
at random with replacement—to produce several subsets of the training data. A base regressor
model, like a decision tree or random forest, is then trained using each subset. To produce the
final result during prediction, the Bagging Regressor combines each of these base models'
individual predictions.

Reducing overfitting and variance is one of the main benefits of using a Bagging Regressor.
Because the base regressors are trained on different subsets of the data, they capture
different facets of the underlying data distribution, and averaging their outputs produces
more stable and reliable predictions.

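Pseudocode
Input: Data set battery_dataset = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)};
       Base learning algorithm L;
       Number of learning rounds K.

Process:
for k = 1, ..., K:
    battery_k = Bootstrap(battery_dataset)   % Generate a bootstrap sample from battery_dataset
    h_k = L(battery_k)                       % Train a base learner h_k from the bootstrap sample
end

Output (regression): \( H(x) = \frac{1}{K} \sum_{k=1}^{K} h_k(x) \)   % average of the base learners' predictions

For classification, the output is instead the majority vote
\( H(x) = \arg\max_{y \in Y} \sum_{k=1}^{K} l(y = h_k(x)) \)   % the value of l(a) is 1 if a is true and 0 otherwise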
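
A minimal Python sketch of the bagging procedure for regression follows. It assumes a DecisionTreeRegressor as the base learner and synthetic data; the function names bagging_fit and bagging_predict are hypothetical. The project itself uses Scikit-learn's BaggingRegressor, which packages the same idea.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def bagging_fit(X, y, n_rounds, seed=0):
    """Train one base regressor per bootstrap sample."""
    rng = np.random.default_rng(seed)
    n_samples = len(X)
    learners = []
    for _ in range(n_rounds):
        # Bootstrap sample: rows drawn at random with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        learner = DecisionTreeRegressor(random_state=0)
        learner.fit(X[idx], y[idx])
        learners.append(learner)
    return learners


def bagging_predict(learners, X):
    """Average the predictions of all base regressors."""
    return np.mean([learner.predict(X) for learner in learners], axis=0)


# Synthetic stand-in for the battery dataset.
data_rng = np.random.default_rng(1)
X = data_rng.uniform(0.0, 1.0, size=(100, 2))
y = 3.0 * X[:, 0] + data_rng.normal(scale=0.1, size=100)

learners = bagging_fit(X, y, n_rounds=10)
pred = bagging_predict(learners, X)
```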

Logistic Regression: Logistic regression is a statistical method for binary classification
tasks, in which the target variable has only two possible outcomes. Despite its name,
logistic regression is mostly used for classification, not regression. It uses the logistic
function, commonly referred to as the sigmoid function, to model the probability that an
instance belongs to a specific class.

The logistic function, which maps the output to a value between 0 and 1, is applied after the
input features are linearly combined with weights in logistic regression. The likelihood that the
instance belongs to one of the two classes is indicated by this value. The instance is usually
classified into the positive class if the probability is greater than a predetermined threshold
(e.g., 0.5); if not, it is classified into the negative class.

One of the main benefits of logistic regression is its ease of use and interpretability: the
model makes the relationship between the input features and the binary outcome easy to
understand. Furthermore, logistic regression can handle both numerical and categorical
features, the latter once they are encoded numerically. Equ (3) represents the sigmoid
function and Equ (4) represents the logistic regression equation.

\[
f(\text{value}) = \frac{1}{1 + e^{-\text{value}}} \tag{3}
\]
where,
e = base of natural logarithms
value = numerical value one wishes to transform

\[
y = \frac{e^{(b_0 + b_1 x)}}{1 + e^{(b_0 + b_1 x)}} \tag{4}
\]
where,
x = input value
y = predicted output
b_0 = bias or intercept term
b_1 = coefficient for input (x)
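
Equ (3) and Equ (4) can be checked numerically with a few lines of NumPy. The coefficient values passed below are hypothetical, chosen only to show the calculation, and the 0.5 threshold is the example threshold mentioned in the text.

```python
import numpy as np


def sigmoid(value):
    """Equ (3): maps any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-value))


def predict_proba(x, b0, b1):
    """Equ (4): probability of the positive class for input x."""
    return sigmoid(b0 + b1 * x)


def predict_class(x, b0, b1, threshold=0.5):
    """Positive class (1) when the probability reaches the threshold."""
    return (predict_proba(x, b0, b1) >= threshold).astype(int)


x = np.array([-2.0, 0.0, 2.0])
b0, b1 = 0.0, 1.0                      # hypothetical coefficients
probabilities = predict_proba(x, b0, b1)
classes = predict_class(x, b0, b1)

# Equ (4) written as e^z / (1 + e^z) is the same function as Equ (3) applied to z.
z = b0 + b1 * x
equivalent = np.allclose(np.exp(z) / (1.0 + np.exp(z)), sigmoid(z))
```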

Extra Trees Regressor: The Extra Trees Regressor is used to forecast the remaining useful
life (RUL) of the lithium-ion batteries used in automobiles. As a member of the ensemble
learning family, it works similarly to a random forest, building numerous decision trees to
produce predictions. Unlike a standard random forest, Extra Trees adds randomness by drawing
random split thresholds for each candidate feature and, by default, building each tree from
the full dataset rather than a bootstrap sample. This increases the variety among trees,
which may reduce overfitting while enhancing robustness.

One notable advantage of the Extra Trees Regressor is its ability to handle large datasets
with noisy or irrelevant features. This matters in our project, which deals with a large
amount of sensor data. By exploiting randomization in feature selection and tree
construction, the Extra Trees Regressor can capture complicated relationships between the
input features and the target variable, such as the remaining useful life.

Furthermore, the Extra Trees Regressor strikes a compromise between bias and variance, which
makes it appropriate for the battery health prediction job. With its robust performance and
efficient computational characteristics, the Extra Trees Regressor emerges as a reliable choice
for accurately estimating the remaining useful life of lithium-ion batteries, significantly
improving the predictability and security of battery-powered automobiles.

5.3 PROPOSED DESIGN

Data collection: Collect engine parameter data, such as RPM, coolant pressure, fuel pressure,
lubricating oil temperature, and coolant temperature, from sensors mounted in cars or other
machines. Additionally, obtain historical data on lithium-ion battery characteristics,
including voltage, current, temperature, and cycle count, from battery management systems or
Internet of Things sensors.

Data Preprocessing: To guarantee consistency and dependability in the data, carry out
preprocessing procedures such as addressing missing values, eliminating outliers, and
standardizing characteristics for both the engine and battery datasets.

Exploratory Data Analysis (EDA): To understand the distribution of each feature, find
correlations, and investigate relationships between variables, perform EDA separately for
the engine and battery datasets. This analysis aids feature engineering and selection and
provides insight into the properties of the data.

Feature Engineering: Create new features for both datasets that could improve the models'
capacity for prediction. Features such as rolling statistics or ratios between parameters could be
helpful for the engine dataset. Features that capture patterns of degradation, like trend analysis
or time-series decomposition, can be built for the battery dataset.

Model Selection: Determine the best machine learning models for both tasks depending on the
nature of the data and the problem at hand. Decision Trees, Random Forests, and Support
Vector Machines may all be used to predict engine condition. For battery RUL prediction,
regression models such as Random Forest Regression and Gradient Boosting Regression, or Long
Short-Term Memory (LSTM) networks for time-series data, may be appropriate.

Hyperparameter Tuning: Employ approaches such as grid search or random search over model
hyperparameters to improve model performance and generalization.
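
Grid search can be sketched with Scikit-learn's GridSearchCV. The example below is an assumption-laden illustration: the parameter grid, the 3-fold cross-validation, the synthetic data and the choice of RandomForestRegressor are not taken from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for the battery features.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=120)

# Hypothetical grid: every combination is evaluated by cross-validation.
param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

best_params = search.best_params_
# The score is a negated MSE, so flip the sign before taking the root.
best_rmse = float(np.sqrt(-search.best_score_))
```

Random search follows the same pattern with RandomizedSearchCV, which samples a fixed number of combinations instead of trying them all.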

Model Evaluation: Utilize appropriate evaluation metrics like accuracy, precision, recall, F1-
score, mean squared error (MSE), root mean squared error (RMSE), or mean absolute error
(MAE) to assess the performance of the trained models for engine condition prediction and
battery RUL prediction.

5.4 CODE
The following code snippets are used to predict the Engine condition

1. Exploratory data analysis: Exploring engine data entails reviewing information, descriptions,
and value counts for engine condition.
engine_df.info()
engine_df.describe()
engine_df["Engine Condition"].value_counts()

2. Splitting engine data into features and labels

engine_features = engine_df.drop("Engine Condition", axis=1)
engine_labels = engine_df["Engine Condition"]

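3. Data Preprocessing: create a custom transformer to add new attributes based on engine
features.
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

# rpm_idx, oil_temp_idx and coolant_temp_idx are the column positions of engine RPM,
# lubricating oil temperature and coolant temperature in engine_features; they must be
# defined before this transformer is used.
class AttributesAdder(BaseEstimator, TransformerMixin):
    # Constructor of the class
    def __init__(self, add_oil_system=True, add_coolant_system=True):
        self.add_oil_system = add_oil_system
        self.add_coolant_system = add_coolant_system

    def fit(self, X, y=None):
        # Nothing to learn; the transformer is stateless
        return self

    def transform(self, X):
        if self.add_oil_system:
            # Derived oil-system feature: inverse of RPM times oil temperature
            oil_efficiency = 1 / (X[:, rpm_idx] * X[:, oil_temp_idx])
            X = np.c_[X, oil_efficiency]
        if self.add_coolant_system:
            # Derived coolant-system feature: coolant temperature per unit RPM
            cool_efficiency = (1 / X[:, rpm_idx]) * X[:, coolant_temp_idx]
            X = np.c_[X, cool_efficiency]
        return X

attr_addr = AttributesAdder()
engine_prep = attr_addr.transform(engine_features.values)
print(f"Transformed Data: {engine_prep[0, :]}")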

4. Pipeline for adding new properties and standardizing features to automate data preparation
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

engine_prep_pipe = Pipeline([
    ("attr_adder", AttributesAdder()),
    ("std_scaler", StandardScaler())
])

engine_data_prepared = engine_prep_pipe.fit_transform(engine_features.values)
engine_data_prepared[0, :]

5. Splitting the Prepared Dataset

from sklearn.model_selection import train_test_split

# Hold out 10% of the prepared data for testing
X_train, X_test, y_train, y_test = train_test_split(
    engine_data_prepared, engine_labels, test_size=0.1, random_state=42)
print(f"Shape X Train: {X_train.shape}")
print(f"Shape y Train: {y_train.shape}\n")
print(f"Shape X Test: {X_test.shape}")
print(f"Shape y Test: {y_test.shape}\n")

6. Applying Machine Learning algorithms

1. Logistic Regression
from sklearn.linear_model import LogisticRegression

log_reg = LogisticRegression()
log_reg.fit(X_train, y_train)
validation = log_reg.predict(X_test)
# Accuracy: fraction of test samples predicted correctly
score = sum(validation == y_test)
print(f"Score: {score / len(y_test)}")

2. Decision Tree Classification
from sklearn.tree import DecisionTreeClassifier

tree_cls = DecisionTreeClassifier()
tree_cls.fit(X_train, y_train)
validation = tree_cls.predict(X_test)
score = sum(validation == y_test)
print(f"Score: {score / len(y_test)}")

3. Random Forest classification
from sklearn.ensemble import RandomForestClassifier

forest_cls = RandomForestClassifier()
forest_cls.fit(X_train, y_train)
validation = forest_cls.predict(X_test)
score = sum(validation == y_test)
print(f"Score: {score / len(y_test)}")

The following code snippets are used to predict the Battery RUL

1. Splitting dataset into target and features

target = df['RUL']
features = df.drop(['RUL'], axis=1)
target.shape, features.shape
# Output: ((15064,), (15064, 8))
# Cycle_Index is dropped from the feature set
features = features.drop(['Cycle_Index'], axis=1)
features.shape

2. Standardizing of features to transform data and maintain consistency

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
features_std = scaler.fit_transform(features)
features_std = pd.DataFrame(features_std, columns = features.columns)
features_std

3. Splitting the standardized features into training and testing datasets.

from sklearn.model_selection import (train_test_split, StratifiedKFold)
X_train, X_test, y_train, y_test = train_test_split(features_std, target, test_size=0.2,
random_state=2301)

4. Comparing the regression models

import time
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import Pipeline

# `pipeline` (the preprocessing steps) and `algorithms` (the list of regressors to compare)
# are assumed to be defined earlier in the notebook.
def prepare_model(algorithm, X_train, y_train):
    model = Pipeline(steps=[('preprocessing', pipeline), ('algorithm', algorithm)])
    model.fit(X_train, y_train)
    return model

names = []
times = []
mse = []
rmse = []
for algorithm in algorithms:
    name = type(algorithm).__name__
    names.append(name)
    # Time training plus prediction for each algorithm
    start_time = time.time()
    model = prepare_model(algorithm, X_train, y_train)
    pred = model.predict(X_test)
    end_time = time.time()
    times.append(end_time - start_time)
    mse.append(mean_squared_error(y_test, pred))
    rmse.append(np.sqrt(mean_squared_error(y_test, pred)))

print('Regression Results in Algorithms')
results_dict = {'Algorithm': names, 'MSE': mse, 'RMSE': rmse, 'Time': times}
pd.DataFrame(results_dict).sort_values(by='RMSE', ascending=True)

5. Applying the algorithms

1. Random Forest Regressor
rfr = RandomForestRegressor(random_state=2301, n_estimators=100)
rfr.fit(X_train, y_train)
print(rfr.score(X_train, y_train))
print(rfr.score(X_test, y_test))
rfr_pred = rfr.predict(X_test)
rfr_rmse = np.sqrt(mean_squared_error(y_test, rfr_pred))
print(rfr_rmse)

2. Bagging Regressor
br = BaggingRegressor(random_state=2301)
br.fit(X_train, y_train)
print(br.score(X_train, y_train))
print(br.score(X_test, y_test))
br_pred = br.predict(X_test)
br_rmse = np.sqrt(mean_squared_error(y_test, br_pred))
print(br_rmse)

3. Extra Tree Regressor

etr = ExtraTreeRegressor(random_state=2301)
etr.fit(X_train, y_train)
print(etr.score(X_train, y_train))
print(etr.score(X_test, y_test))
etr_pred = etr.predict(X_test)
etr_rmse = np.sqrt(mean_squared_error(y_test, etr_pred))
print(etr_rmse)

4. Decision Tree Regressor
dtr = DecisionTreeRegressor(random_state=2301)
dtr.fit(X_train, y_train)
print(dtr.score(X_train, y_train))
print(dtr.score(X_test, y_test))
dtr_pred = dtr.predict(X_test)
dtr_rmse = np.sqrt(mean_squared_error(y_test, dtr_pred))
print(dtr_rmse)


CHAPTER 6

TESTING
6.1 Testing in Machine Learning
Machine learning uses algorithms and statistical analysis to let a machine learn from data
without being explicitly programmed. The resulting model relies on an algorithm built on a
mathematical framework. In the context of machine learning models, the term "testing" largely
refers to assessing model performance in terms of accuracy and precision. It should be
emphasized that "testing" means different things in traditional software development and in
machine learning model creation.
Typical unit/integration testing on its own is not sufficient for machine learning models, so
they are also tested on the accuracy of their predictions.
Accuracy: It is one metric used to evaluate classification methods. Informally, accuracy is
the fraction of predictions that our model gets right. Formally, accuracy is defined as
follows:
Accuracy = number of correct predictions / total number of predictions
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Where
TP stands for True Positives,
TN for True Negatives,
FP for False Positives, and
FN for False Negatives.
Precision and recall are also frequently used to assess classification methods. Precision
(also known as positive predictive value) is the proportion of predicted positives that are
actually positive, whereas recall (also known as sensitivity) is the proportion of actual
positives that the model correctly identifies.
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
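
As a minimal sketch of these definitions, the three metrics can be computed directly from the
four confusion-matrix counts. The function name `classification_metrics` and the counts used
below are illustrative assumptions, not values measured in this project.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    # Guard against empty denominators so the function never divides by zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts, chosen only to illustrate the formulas.
accuracy, precision, recall = classification_metrics(tp=60, tn=25, fp=10, fn=5)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

With these counts, precision and recall differ because the model makes more false positive
than false negative errors, which accuracy alone would not reveal.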

A test dataset is separate from the training dataset but follows the same probability
distribution. If a model that fits the training dataset well also fits the test dataset well,
it has generalized rather than overfitted. Thus, by comparing the predicted and actual values
on the test dataset, we can determine how well our model works.
Machine learning testing starts with data preparation, which involves cleaning and
preprocessing the engine health and battery RUL datasets. We divided the data into training and
testing sets so that both sets were representative of the full distribution. The models, such as
logistic regression for engine condition prediction and regression models for battery RUL
prediction, are evaluated on the testing data after they have been trained on the training
data. Performance indicators such as accuracy, precision, recall, and F1-score are used to
evaluate the classification models' predictive ability.
Overall, testing in this project takes an intensive approach to ensuring the reliability, accuracy,
and interpretability of the machine learning models used for engine state prediction and battery
RUL calculation. It aims to develop robust prediction models that can successfully support
predictive maintenance efforts in automotive systems through rigorous testing and evaluation.

6.2 Testing Objectives

Testing objectives state the aims of the testing procedure throughout the project or system
development lifecycle. They are intended to ensure the functionality, quality, and
dependability of the software or product being tested. These objectives usually involve
confirming that the program satisfies predetermined requirements, locating errors or
malfunctions, evaluating performance in different scenarios, ensuring usability and
accessibility, and confirming adherence to rules and guidelines. Additional testing objectives
include evaluating the software's security, scalability, and interoperability, and assessing
its general readiness for deployment and use in real-world circumstances. In the end, testing
goals help to reduce risks, raise user satisfaction, and increase the software's quality and
dependability.

The project's testing goals include a comprehensive assessment of the battery RUL prediction
algorithms and engine condition models. This entails carefully examining how well predictions
match known data, confirming that the models can generalize well to new datasets, and
evaluating how robust the models are to fluctuations and noise in the input data. The models are
evaluated using metrics such as accuracy, precision, recall, and error rates in order to determine
which methods are the best fit. Furthermore, extensive real-world simulation tests and
deployment readiness evaluations are carried out to check the dependability and usefulness
of the models in operational contexts. The project's goal is to provide reliable and accurate
predictive models for battery RUL estimation and engine condition prediction through these
testing methodologies.

The following are the testing objectives for the engine condition and battery RUL models.

Prediction correctness: By comparing the predicted values with the actual conditions and
RUL values from the test dataset, assess the correctness of the battery RUL and engine
condition prediction models. This checks that the models produce precise and dependable
forecasts.

Model Generalization: Evaluate the models' capacity for generalization by putting them to the
test on unobserved data. To check if the models can accurately forecast fresh, unseen data,
divide the dataset into training and test sets. Then, assess the models' performance on the test
set.

Robustness Testing: Determine whether the models are resilient by varying or adding noise to
the input data and assessing how well they work. This helps ascertain whether the models'
predicted accuracy can be considerably affected by changes in ambient factors, sensor readings,
or battery usage habits.

Metrics of Performance: To assess the efficacy of the models and pinpoint areas for
development, compute and examine performance metrics such as accuracy, precision, recall,
F1-score, mean squared error (MSE), root mean square error (RMSE), and mean absolute error
(MAE).

Comparative Analysis: To determine which machine learning algorithm or technique
performs best for predicting engine condition and battery RUL, compare its performance with
other methods. Consider the models' scalability, computational efficiency, and forecast
accuracy when evaluating them.
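
The robustness and error-metric objectives above can be sketched as follows. This is only an
illustration under stated assumptions: `toy_model` is a hypothetical stand-in for a trained
regressor, the cycle values are synthetic, and the noise level (standard deviation 5.0) is an
arbitrary choice rather than a value used in this project.

```python
import math
import random


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def toy_model(cycle):
    # Hypothetical stand-in for a trained model: RUL falls linearly with the cycle count.
    return 1000.0 - 2.0 * cycle


random.seed(0)  # fixed seed so the noisy run is repeatable
cycles = list(range(0, 500, 10))
y_true = [toy_model(c) for c in cycles]  # synthetic targets

# Predict once on clean inputs and once on inputs perturbed with Gaussian noise.
clean_pred = [toy_model(c) for c in cycles]
noisy_pred = [toy_model(c + random.gauss(0.0, 5.0)) for c in cycles]

print("clean RMSE:", rmse(y_true, clean_pred))
print("noisy RMSE:", rmse(y_true, noisy_pred))
print("noisy MAE: ", mae(y_true, noisy_pred))
```

Comparing the clean and noisy error values indicates how sensitive a model is to input
perturbations; for a real model, `toy_model` would be replaced by the trained estimator's
prediction call.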

6.3 Testing Phases

Testing phases are the systematic stages used to assess the functionality, quality, and
performance of a software system or application. These stages make sure the software satisfies
the requirements, performs well, and delivers the value it was designed to provide to users.

Unit Testing: In order to make sure the software works as intended, each function, method, or
module is tested separately at this phase. The primary goal of unit tests is to confirm that brief,
discrete segments of code are valid. To verify that each unit functions independently of the rest
of the system, developers create unit tests. This guarantees that every component functions as
intended and aids in the early detection of bugs during the development process.

In our project, unit testing covers the individual software modules or functions, such as
those that predict engine status and battery remaining useful life (RUL). To make sure it
functions as intended, each unit (including the prediction algorithms and data preprocessing
steps) is tested separately. Developers write unit tests to verify independently that each
prediction model or data transformation step behaves correctly and to identify issues early
in the development cycle.
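
A unit test for a single preprocessing step might look like the sketch below. The
`standardize` helper is a hypothetical pure-Python stand-in for the standardization step
(zero mean, unit variance, using the population standard deviation); it is not the project's
actual code, and the input values are illustrative.

```python
import math
import unittest


def standardize(values):
    """Scale values to zero mean and unit variance (population standard deviation)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


class StandardizeTest(unittest.TestCase):
    def test_zero_mean_unit_variance(self):
        scaled = standardize([700, 876, 520, 473, 619])
        mean = sum(scaled) / len(scaled)
        variance = sum((v - mean) ** 2 for v in scaled) / len(scaled)
        self.assertAlmostEqual(mean, 0.0, places=9)
        self.assertAlmostEqual(variance, 1.0, places=9)

    def test_constant_column(self):
        self.assertEqual(standardize([5, 5, 5]), [0.0, 0.0, 0.0])


# Run the test case programmatically and keep the result object for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(StandardizeTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("all unit tests passed:", result.wasSuccessful())
```

Each test exercises the unit in isolation, so a failure points directly at the preprocessing
step rather than at the model that consumes its output.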

System Testing: A critical stage of the software development lifecycle is system testing,
during which the system as a whole is assessed to make sure it satisfies requirements and
operates as intended. During this testing step, the behavior of the complete system is
evaluated instead of that of its constituent parts or units. System testing covers the
integrated system's interfaces, data flows, and interactions with users or other systems.

System testing assesses how well the complete predictive system functions and behaves in
terms of predicting engine status and battery RUL. In order to make sure that all of the parts—
predictive models, user interfaces, and data preprocessing—integrate as planned, testing is done
at this point. System tests verify that the program satisfies predetermined requirements and
operates correctly in a range of scenarios by examining end-to-end scenarios and user
interactions. During this testing step, the behavior of the system is verified holistically to make
sure it satisfies user expectations and makes accurate predictions.

Integration Testing: A vital phase of developing software is integration testing, which verifies
the smooth operation and interface between separate units or components. Its main objective is
to make sure that the integrated components function as a whole, identifying problems with
data transfer, communication protocols, and integration points. Integration testing uses several
techniques, such as top-down or bottom-up testing, to identify bugs in software connections
early on, reducing risks and making sure the program satisfies both functional as well as non-
functional criteria.

This process checks for smooth operation and accurate data interchange by confirming the
interactions and data flow between various software components. In our project, integration
testing validates the integration of the different algorithms, data preprocessing procedures,
and other modules involved in predicting engine status and battery RUL. In order to verify the
overall functionality of our predictive system, these tests concentrate on identifying any
problems, such as mismatched interfaces or communication faults between various elements of
the system.

Regression Testing: Regression testing is an important element of software development
because it ensures that new alterations or modifications to code do not cause undesired side
effects or regressions in current features. Regression testing ensures that the product performs
as intended following updates, additions, or bug patches by rerunning previously performed test
cases. It contributes to the stability and dependability of the product by recognizing and
addressing any issues that may develop as a result of code changes, ensuring that the software
stays consistent and functioning throughout iterations.

Regression testing verifies that recent software upgrades or updates have not caused any
unforeseen side effects or regressions in functionality. In our project, regression testing
means rerunning previously executed test cases to identify any differences in predicted engine
condition or battery RUL induced by recent code modifications. This testing process
contributes to the stability and reliability of the predictive system by discovering and
dealing with any issues that arise during development or upgrades, so that the software
continues to work as planned over time.
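
One way to sketch such a regression check is to compare the current predictions against a
stored baseline and flag any case that drifts beyond a tolerance. The helper name
`find_regressions`, the RUL values, and the tolerance of 5.0 are all hypothetical and only
illustrate the idea.

```python
def find_regressions(baseline, current, tolerance):
    """Return the indices of test cases whose prediction drifted beyond the tolerance."""
    return [
        index
        for index, (old, new) in enumerate(zip(baseline, current))
        if abs(old - new) > tolerance
    ]


# Hypothetical RUL predictions saved from a previously accepted model version.
baseline_rul = [1112.0, 1098.5, 1075.2, 1050.0]
# Hypothetical predictions for the same inputs after a code change.
current_rul = [1112.4, 1098.1, 1090.9, 1050.0]

drifted = find_regressions(baseline_rul, current_rul, tolerance=5.0)
print("test cases that changed beyond tolerance:", drifted)
```

A non-empty result does not necessarily mean the change is wrong, but it identifies the
predictions that should be reviewed before the update is accepted.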

Performance Testing: Performance testing is an important component of software
development that evaluates an application's responsiveness, scalability, and stability under
different load situations. Its goal is to evaluate the software's effectiveness in terms of speed,
performance, and resource consumption, as well as to discover any performance bottlenecks or
issues that may have an influence on the user experience. Performance testing ensures that the
program meets requirements for performance and can deal with predicted workloads
efficiently, hence enhancing its overall efficiency and user satisfaction.
In our project, performance testing evaluates the predictive system's responsiveness,
scalability, and efficiency under varying load situations. It assesses the system's capacity
to handle a variety of prediction requests efficiently and consistently. This testing phase
measures characteristics such as response times, throughput, and resource consumption in
order to discover any bottlenecks or performance concerns that may affect the system's
efficiency in production. Performance testing checks that the predictive system is capable of
handling predicted workloads while also meeting performance criteria for fast and accurate
forecasts.
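
Response time and throughput can be measured with a small timing harness such as the sketch
below. The `predict` function is a hypothetical placeholder for a real model's prediction
call, and the request payload and request count are arbitrary assumptions.

```python
import time


def predict(features):
    # Hypothetical stand-in for a trained model's prediction on one request.
    return sum(features) / len(features)


def measure(predict_fn, requests):
    """Time each prediction and summarise latency and throughput."""
    latencies = []
    for features in requests:
        start = time.perf_counter()
        predict_fn(features)
        latencies.append(time.perf_counter() - start)
    total = sum(latencies)
    return {
        "mean_latency_s": total / len(latencies),
        "max_latency_s": max(latencies),
        # Requests handled per second of pure prediction time.
        "throughput_rps": len(latencies) / total if total > 0 else float("inf"),
    }


# 1000 identical requests, each with six illustrative sensor readings.
requests = [[700, 2.49, 11.79, 3.17, 84.14, 81.63]] * 1000
stats = measure(predict, requests)
print(stats)
```

The measured numbers depend on the machine running the test, so they are best compared
against a threshold agreed for the deployment environment rather than read in isolation.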

6.4 Test Cases

Test cases are critical components of the validation process in a machine learning project aimed
at predicting engine health and estimating battery remaining useful life. These test cases are
carefully designed to evaluate the performance and usefulness of the models objectively. Each
test case describes a specific scenario or condition that the models must handle correctly. For
example, in engine condition prediction, test cases may include determining the model's ability
to classify engine condition into the positive or negative class based on sensor inputs.
Similarly, for battery remaining useful life prediction, test cases could include determining a
model's accuracy in projecting battery longevity under varied usage scenarios.
These test cases cover a variety of circumstances to ensure a thorough review. They specify
the input data, the expected outputs or results, and the criteria for evaluating the model's
performance. Through the systematic execution of these test cases, we can check that the
machine learning algorithms align with project objectives, are robust across multiple
conditions, and provide credible predictions. Finally, the extensive validation offered by test
cases improves the accuracy and efficacy of machine learning methods for predictive
maintenance activities.
• Data pre-processing: It involves removing null or missing values and splitting the
data. The expected output is successful data preprocessing, and the observed output is the
same, so the test case passes.

Table 6.4.1: Test cases for data pre-processing

Test Case Sl. No    1
Test Name           Data pre-processing
Test Feature        It removes null or missing values.
Output Expected     Dataset preprocessing is successful
Output Obtained     Dataset preprocessing is successful
Result              Successful

• Model Creation: Creates models using algorithms and datasets. The expected
output is a successfully created model, and the observed output is the same.

Table 6.4.2: Test cases for Model creation

Test Case Sl. No    2
Test Name           Model creation
Test Feature        Create the model based on the algorithm using the dataset.
Output Expected     Model created successfully
Output Obtained     Model created successfully
Result              Successful

• Feature Engineering: Once the feature engineering process is successfully
completed, the dataset is enhanced with new attributes. The expected output is
to successfully add new features, and the observed output is the same.

Table 6.4.3: Test cases for Feature engineering

Test Case Sl. No    3
Test Name           Feature Engineering
Test Feature        Once the feature engineering process is successfully completed, the
                    dataset is enhanced with new attributes. We can observe improved
                    correlations and distributions in the dataset.
Output Expected     Successfully engineered features will be integrated into the dataset.
Output Obtained     Successfully engineered features are integrated into the dataset.
Result              Successful

• Engine Health and Battery RUL Prediction: Predicts the engine health and
Remaining Useful Life (RUL) of the battery based on the trained models. The
expected outcome is to evaluate the health status and RUL of the battery, and the
outcome is successful.

Table 6.4.4: Test cases for Engine health and Battery RUL prediction

Test Case Sl. No    4
Test Name           Engine Health and Battery RUL Prediction
Test Feature        Predicts the engine health and Remaining Useful Life (RUL) of the
                    battery based on the trained models. Once the prediction is made,
                    evaluate the health status and RUL of the battery.
Output Expected     Successfully predicts the engine health and battery RUL.
Output Obtained     Successfully predicts the engine health and battery RUL.
Result              Successful


CHAPTER 7

EXPERIMENT RESULTS

Sl.  Cycle  Discharge  Decrement      Max. Voltage   Min. Voltage  Time at     Time constant  Charging   RUL
no   Index  Time (s)   9.9-9.35V (s)  Discharg. (V)  Charg. (V)    11.41V (s)  current (s)    time (s)
0    1      7138.11    3167.05        10.093         8.83          15017.19    18578.98       29643.32   3058.44
1    2      20376.73   3224.87        11.678         8.85          15151.93    18598.26       28880.17   3055.69
2    3      20335.80   3061.17        11.68          8.86          15151.94    18598.26       28660.22   3052.94
3    4      20313.087  2971.31        11.68          8.87          15132.75    18598.26       28391.86   3050.19
4    5      178838.64  81999.04       11.79          9.34          15074.92    146358.57      155946.77  3044.69

Fig 7.1: Snapshot of the battery dataset

Data for various battery operation cycles, such as discharge duration, voltage decrement times,
and remaining useful life (RUL), are shown in Fig 7.1. Each row denotes one cycle, described by
its constant-current time, charging time, maximum and minimum voltage, and other
characteristics. The data sheds light on battery degradation and performance across several
cycles.

       Cycle_Index  Discharge  Decrement      Max. Voltage   Min. Voltage  Time at     Time constant  Charging  RUL
                    Time (s)   9.9-9.35V (s)  Dischar. (V)   Charg. (V)    11.41V (s)  current (s)    time (s)
Count  15063.00     1.50e+04   1.50e+04       15063.00       15063.00      15063.00    1.50e+04       1.50e+04  15063.00
mean   556.11       1.24e+04   3.38e+03       10.74          9.84          10356.05    1.49e+04       2.75e+04  1524.35
std    322.35       8.99e+04   4.12e+04       0.25           0.34          25089.67    6.78e+04       7.14e+04  886.76
min    1.00         2.39e+01   -1.09e+06      8.36           8.31          -312.40     1.64e+01       1.64e+01  0.00
25%    271.00       3.21e+03   8.79e+02       10.57          9.59          5030.16     7.05e+03       2.15e+04  761.86
50%    560.00       4.28e+03   1.20e+03       10.74          9.82          8059.19     1.05e+04       2.28e+04  1515.47
75%    833.00       5.24e+03   1.65e+03       10.92          10.07         11244.51    1.37e+04       2.41e+04  2307.58
max    1134.00      2.63e+06   1.11e+06       12.00          12.04         674126.38   2.42e+06       2.42e+06  3116.20

Fig 7.2: Statistics of the battery dataset

Fig. 7.2 gives a statistical overview of several battery performance characteristics, such as
cycle index, discharge duration, voltage decrement times, and remaining useful life (RUL). It
provides information on the distribution and variability of the battery data by displaying the
count, mean, standard deviation, minimum, quartiles, and maximum for each parameter. The
statistics show a broad range of values for several parameters, which suggests variations in
the performance of the battery and possible outliers within the dataset.

Sl no  Algorithms                   MSE         RMSE     TIME
1      Bagging Regressor            4443.04     66.66    0.72
2      Random Forest Regressor      4703.35     68.58    6.97
3      Kneighbors Regressor         6606.19     81.28    0.14
4      Extra Tree Regressor         8890.75     94.29    0.070
5      Decision Tree Regressor      9119.10     95.50    0.16
6      Gradient Boosting Regressor  16757.18    129.45   3.27
7      AdaBoost Regressor           31168.30    176.55   0.96
8      SVR                          104628.06   323.46   9.87
9      AdaBoost Regressor           176805.94   420.48   0.04
10     Linear Regression            177147.79   420.889  0.04

Fig 7.3: Regression Results in Algorithms

The performance of several regression algorithms is compiled in Fig. 7.3 according to
execution time, mean squared error (MSE), and root mean square error (RMSE). Compared to
the other algorithms, Bagging Regressor and Random Forest Regressor had the lowest
MSE and RMSE, indicating superior prediction accuracy. Bagging Regressor was also
considerably faster than Random Forest Regressor (0.72 versus 6.97), although it was not the
fastest overall: Linear Regression was among the fastest at 0.04 but had the highest error.

Fig 7.4: Bagging Regressor


The Bagging Regressor is an ensemble learning strategy that combines several base estimators
to improve prediction performance. To lower variance and improve robustness, it
trains each base estimator on a different subset of the training data and averages their
predictions. Among the examined regression methods, Bagging Regressor achieved the lowest
mean squared error and root mean squared error.

Fig 7.5: Random Forest Regressor

The Random Forest Regressor is an ensemble learning method built on decision trees. It
trains numerous trees on random subsets of the data and averages their predictions. It
is known for its adaptability, resilience to overfitting, and capacity to work with large,
high-dimensional datasets.

Sl. no  Engine  Lub oil   Fuel      Coolant   Lub oil  Coolant  Engine
        rpm     pressure  pressure  pressure  temp     temp     Condition
0       700     2.49      11.79     3.17      84.14    81.63    1
1       876     2.94      16.19     2.46      77.64    82.44    0
2       520     2.96      6.55      1.06      77.75    79.64    1
3       473     3.70      19.51     3.72      74.12    71.77    1
4       619     5.67      15.73     2.05      78.39    87.00    0

Fig 7.6: Snapshot of Engine dataset

The engine dataset, shown in Fig. 7.6, contains sensor data that tracks important engine
operating characteristics, namely engine rpm, lubricating oil pressure, fuel pressure, coolant
pressure, lubricating oil temperature, and coolant temperature, together with the engine
condition label. Every row corresponds to a distinct measurement instance that was probably
obtained at various points in time while the engine was running.

       Engine    Lub oil   Fuel      Coolant   Lub oil   Coolant   Engine
       rpm       pressure  pressure  pressure  temp      temp      Condition
Count  19535.00  19535.00  19535.00  19535.00  19535.00  19535.00  19535.00
mean   791.23    3.30      6.65      2.33      77.64     78.42     0.63
std    267.61    1.02      2.76      1.03      3.11      6.20      0.48
min    61.00     0.001     0.003     0.002     71.32     61.67     0.00
25%    593.00    2.51      4.91      1.60      75.72     73.89     0.00
50%    746.00    3.16      6.20      2.16      76.81     78.34     1.00
75%    934.00    4.055     7.74      2.84      78.07     82.91     1.00
max    2239.00   7.265     21.13     7.47      89.58     195.52    1.00

Fig 7.7: Statistics of engine dataset

Fig 7.7 summarises the statistics of the engine dataset's seven variables: Engine rpm, Lub oil
pressure, Fuel pressure, Coolant pressure, Lub oil temperature, Coolant temperature, and
Engine Condition. For every variable, it provides the count, mean, standard deviation,
minimum, 25th percentile (Q1), median (50th percentile), 75th percentile (Q3), and maximum.
These statistics describe the distribution, variability, and central tendency of the data.

    Engine  Lub oil   Fuel      Coolant   Lub oil  Coolant  Engine     Coolant     Oil
    rpm     pressure  pressure  pressure  temp     temp     Condition  Efficiency  Efficiency
0   700     2.49      11.79     3.17      84.14    81.63    1          0.11        0.000017
1   876     2.94      16.19     2.46      77.64    82.44    0          0.09        0.000015
2   520     2.96      6.55      1.06      77.75    79.64    1          0.15        0.000025
3   473     3.70      19.51     3.72      74.12    71.77    1          0.15        0.000029
4   619     5.67      15.73     2.05      78.39    87.00    0          0.14        0.000021

Fig 7.8: Feature Engineering

The feature engineering with the two additional columns is shown in Fig. 7.8. To compute the
"Coolant Efficiency" column, divide 1 by the "Engine rpm" and multiply the resulting number by
the "Coolant temp" column. To calculate the "Oil Efficiency" column, divide 1 by the product
of the "Engine rpm" and "Lub oil temp" columns; the values shown in Fig. 7.8 correspond to this
product form (for example, 1 / (700 x 84.14) is approximately 0.000017). These computed
efficiency metrics offer more information about how well the coolant and lubricating oil
systems work to support engine operation and maintain proper temperatures.
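
The two engineered columns can be sketched in plain Python as shown below, using the five rows
from Fig. 7.8. This sketch assumes that Coolant Efficiency is Coolant temp / Engine rpm and
that Oil Efficiency is 1 / (Engine rpm x Lub oil temp), which is the form consistent with the
Oil Efficiency values in Fig. 7.8. The function name `add_efficiency_features` is
hypothetical.

```python
# Rows taken from Fig. 7.8 (only the columns needed for the new features).
rows = [
    {"Engine rpm": 700, "Lub oil temp": 84.14, "Coolant temp": 81.63},
    {"Engine rpm": 876, "Lub oil temp": 77.64, "Coolant temp": 82.44},
    {"Engine rpm": 520, "Lub oil temp": 77.75, "Coolant temp": 79.64},
    {"Engine rpm": 473, "Lub oil temp": 74.12, "Coolant temp": 71.77},
    {"Engine rpm": 619, "Lub oil temp": 78.39, "Coolant temp": 87.00},
]


def add_efficiency_features(row):
    """Return a copy of the row with the two engineered efficiency columns added."""
    engineered = dict(row)
    # Assumed form: (1 / rpm) * coolant temperature.
    engineered["Coolant Efficiency"] = row["Coolant temp"] / row["Engine rpm"]
    # Assumed form: 1 / (rpm * lubricating oil temperature).
    engineered["Oil Efficiency"] = 1.0 / (row["Engine rpm"] * row["Lub oil temp"])
    return engineered


engineered_rows = [add_efficiency_features(row) for row in rows]
for row in engineered_rows:
    print(f"{row['Engine rpm']:>5} {row['Coolant Efficiency']:.2f} {row['Oil Efficiency']:.6f}")
```

With a pandas DataFrame, the same two columns could be computed as vectorised column
expressions instead of a per-row function.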

Fig 7.9: Random Forest Classifier

This code initializes a RandomForestClassifier model with its default settings and trains it on
the features (X_train) and matching target labels (y_train). After training, the model predicts
the target labels for the test set (X_test), and the accuracy score is the fraction of
predictions that match the actual labels (y_test). This accuracy score serves as the
performance evaluation of the trained model on the test set.


Fig 7.10: XGBoost

An XGBoost model is first trained with default settings in this snippet of code, and its accuracy
is assessed on a test set. GridSearchCV is then used to search over a predetermined grid of
hyperparameters in order to optimize the model's performance through hyperparameter tuning.
After determining the optimal hyperparameters, a new XGBoost model is trained with them.
Lastly, the tuned model's accuracy is evaluated on the test set, offering an understanding of the
enhancement made possible by hyperparameter optimization.
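
The idea behind grid search can be illustrated with a small hand-written sketch. It is a
simplified stand-in, not GridSearchCV or XGBoost: it scores every hyperparameter combination
on a single validation set instead of using cross-validation, and the "model" is a
hypothetical one-feature threshold rule. The five rpm and Engine Condition pairs from Fig. 7.6
are reused purely as toy data.

```python
from itertools import product


def predict(params, rpm):
    """Threshold rule: the side of the threshold named by the params is class 1."""
    below = rpm < params["threshold"]
    return int(below == params["below_is_positive"])


def accuracy(params, samples):
    """Fraction of samples whose predicted label matches the actual label."""
    correct = sum(predict(params, rpm) == label for rpm, label in samples)
    return correct / len(samples)


def grid_search(param_grid, samples):
    """Score every combination in the grid and keep the best one."""
    names = list(param_grid)
    best_params, best_score = None, -1.0
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = accuracy(params, samples)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# (Engine rpm, Engine Condition) pairs from Fig. 7.6, used only as toy data.
samples = [(700, 1), (876, 0), (520, 1), (473, 1), (619, 0)]
# Hypothetical grid of hyperparameter values to try.
param_grid = {"threshold": [500, 650, 800], "below_is_positive": [True, False]}

best_params, best_score = grid_search(param_grid, samples)
print("best parameters:", best_params, "accuracy:", best_score)
```

A library implementation additionally averages the score over several cross-validation folds
for each combination, which makes the selected hyperparameters less dependent on one
particular split.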


CHAPTER 7
CONCLUSION

Through machine learning and data-driven methodologies, this predictive maintenance project
for cars aims to improve how vehicles are maintained. By analysing sensor data from the engine
and battery systems, it predicts potential failures and maintenance needs, which improves
vehicle performance, safety, and reliability.

In order to predict engine health, the system examines variables like engine RPM, fuel pressure,
lubricating oil pressure, and temperature. It then looks for anomalies and sends out early
warnings for maintenance or repairs. For the battery, the system uses battery management
system data to predict the remaining useful life (RUL) and plan proactive maintenance, which
supports optimal battery performance and dependability.

A methodical approach is employed in the project, which includes phases for data collection,
preprocessing, feature engineering, model selection, training, and evaluation. Several machine
learning algorithms, including XGBoost, Random Forest, Logistic Regression, and Bagging
Regressor, are utilised in the development of precise predictive models for battery RUL
prediction and engine health prediction.

The automotive sector stands to gain from the application of these predictive maintenance
models in the form of decreased repair costs, decreased downtime, and enhanced driver and
passenger safety. The project lays the groundwork for an ecosystem of automotive maintenance
that is more dependable and efficient by utilising data analytics and machine learning.


CHAPTER 8
FUTURE ENHANCEMENTS

Predictive Analytics Dashboard: By creating an intuitive dashboard interface, automotive
stakeholders will be able to monitor and manage predictive maintenance tasks more effectively.
Proactive maintenance planning is made possible by a dashboard's ability to provide
visualisations, alerts, and recommendations based on predictive insights.

Predictive Maintenance Automation: Implementing automated maintenance scheduling and
intervention systems based on predictive models can reduce manual labour and streamline
maintenance workflows. Integration with current automotive maintenance systems and
procedures can further optimise operational efficiency and cost-effectiveness.

Long-Term Performance Monitoring: Putting in place long-term monitoring systems to keep
tabs on the efficacy and performance of predictive maintenance plans over time can yield
important information for ongoing development. Iterative improvements can be informed by
analysing historical maintenance data and contrasting it with outcomes from predictive
maintenance.

Extension to Fleet Management: Logistics firms and operators of commercial vehicles may
profit from the addition of predictive maintenance features to fleet management software. By
maximising vehicle utilisation, reducing downtime, and optimising fleet operations, predictive
maintenance models can reduce costs and raise customer satisfaction.


REFERENCES

[1] Andreas Theissler , Judith Pérez-Velázquez, Marcel Kettelgerdes, Gordon Elger Elsevier-
Reliability Engineering & System Safety (2021) Predictive maintenance enabled by machine
learning: Use cases and challenges in the automotive industry.

[2] I. Fabio Arena , Mario Collotta , Liliana Luca , Marianna Ruggieri and Francesco Gaetano
Termine Statistical and Stochastic Approaches for Predictive Maintenance in the Context of
Industry 4.0 (2021) Predictive Maintenance in the Automotive

[3] Iron Tessaro,Viviana Cocco Mariani and Leandro dos Santos Coelho First International
Electronic Conference on Actuator Technology: Materials, Devices and Applications(2020)
Machine Learning Models Applied to Predictive Maintenance in Automotive Engine
Components.

[4] Tiwari, A., Varshini, C. R. A., Jha, A., Annamalai, K. R., Deepa, K., & Sailaja, V. (2023).
Use of ML Techniques for Li-Ion Battery Remaining Useful Life Prediction-A Survey. In 2023
Fifth International Conference on Electrical, Computer and Communication Technologies
(ICECCT).

[5] Li, Z., Shi, Q., Xia, J., Wang, K., & Jiang, K. (2023). Novel Method Based on Stacking
Model for Remaining Useful Life Prediction of Lithium-ion Batteries. In 2023 26th
International Conference on Electrical Machines and Systems (ICEMS).

[6] Iron Tessaro,Viviana Cocco Mariani and Leandro dos Santos Coelho First International
Electronic Conference on Actuator Technology: Materials, Devices and Applications(2020)
Machine Learning Models Applied to Predictive Maintenance in Automotive Engine
Components.

[7] Danilo Giordano, Flavio Giobergia, Eliana Pastor, Antonio La Macchia, Tania Cerquitelli
Elsevier: Computers in Industry (2021) Data-driven strategies for predictive maintenance:
Lesson learned from automotive use case.

[8] Wu, Z, Jia, J, Liu, Y, Qi, Q, Yin, L, & Xiao, W. (2022). Prediction of Battery Remaining
Useful Life Based on Multi-dimensional Features and Machine Learning. In 2022 4th
International Conference on Smart Power & Internet Energy Systems (SPIES).

BMSIT & M, Dept of CSE 2023-2024 51

Predictive Maintenance for Automobiles

[9] Aydin, O., & Guldamlasioglu, S. (2017). Using LSTM networks to predict engine
condition on large scale data processing framework. In 2017 4th International Conference on
Electrical and Electronic Engineering (ICEEE).

[10] Sahasrabudhe, N., Asegaonkar, R., Deo, S., Umredkar, S., & Mundada, K. (2020).
Experimental Analysis of Machine Learning Algorithms used in Predictive Maintenance. In
2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT)
IEEE.

[11] Shahid A. Hasib, S. Islam, Ripon K. Chakrabortty, Michael J. Ryan, D. K. Saha, Md H.
Ahamed, S. I. Moye, "A Comprehensive Review of Available Battery Datasets, RUL
Prediction Approaches, and Advanced Battery Management," in IEEE

[12] S. Jafari and Y.-C. Byun, "Optimizing Battery RUL Prediction of Lithium-Ion Batteries
Based on Harris Hawk Optimization Approach Using Random Forest and LightGBM," in IEEE

[13] J. Hu, Y. Lu, and B. Lin, "RUL Prediction for Lithium-ion Batteries Using Combination
Forecasting based on SVR and LSTM," in 2021 China Automation Congress (CAC), IEEE

[14] Gou, B., Xu, Y., & Feng, X. (2020). State-of-Health Estimation and
Remaining-Useful-Life Prediction for Lithium-Ion Battery Using a Hybrid Data-Driven Method. IEEE

[15] Wu, Y., Li, W., Wang, Y., & Zhang, K. (2019). Remaining Useful Life Prediction of
Lithium-Ion Batteries Using Neural Network and Bat-Based Particle Filter. IEEE
