C. LOGISTIC REGRESSION
Logistic regression is a popular statistical technique for modeling binary outcome variables in a variety of domains, including machine learning, the social sciences, and medicine. In contrast to linear regression, which predicts a continuous result, logistic regression estimates the probability of a binary outcome, which has only two possible values, typically denoted as 0 and 1. The fundamental component of logistic regression is the logistic function, commonly referred to as the sigmoid function, which maps the predicted values to probabilities. A logistic regression model therefore forecasts the likelihood that an instance falls into a specific class. The model's coefficients are estimated by maximum likelihood estimation (MLE), whose aim is to maximize the likelihood of the observed data. In logistic regression, the coefficients represent the log odds of the dependent variable, where the odds of an event are the ratio of the probability that the event occurs to the probability that it does not.
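As a brief illustration of this relationship (the notation below is ours and is not drawn from the paper's own equations), the sigmoid function, the resulting class probability, and the log-odds form can be written as:

\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad P(y = 1 \mid x) = \sigma\!\left(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p\right),

\log \frac{P(y = 1 \mid x)}{1 - P(y = 1 \mid x)} = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p .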
There are several reasons to use logistic regression in our research approach. First, logistic regression produces clear probabilistic predictions: maintenance engineers and decision-makers can simply inspect the output probabilities to judge the likelihood of failure. Second, given its ease of implementation and computational efficiency, logistic regression is a useful tool for handling the large datasets frequently seen in predictive maintenance scenarios; this efficiency allows for real-time or near-real-time predictions, enabling timely maintenance actions. Finally, the probabilistic output of logistic regression can be used to define risk scores or thresholds that trigger maintenance procedures based on the estimated probability of failure. This adaptability is useful when scheduling maintenance, as sketched below.
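As a minimal sketch of how such probability thresholding could look in practice (the placeholder data, the 0.7 threshold, and the use of scikit-learn here are illustrative assumptions, not the paper's actual pipeline):

# Illustrative sketch only: data, features, and threshold are assumed placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# X: sensor readings (e.g. temperature, vibration, torque); y: 1 = failure, 0 = no failure
X = np.random.rand(1000, 3)                      # placeholder feature matrix
y = (X[:, 0] + X[:, 1] > 1.2).astype(int)        # placeholder labels

# Normalize features and fit the logistic regression model
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)

# Probabilistic output: estimated probability of failure for new observations
failure_prob = model.predict_proba(X[:5])[:, 1]

# Hypothetical risk threshold that would trigger a maintenance action
RISK_THRESHOLD = 0.7
needs_maintenance = failure_prob >= RISK_THRESHOLD
print(failure_prob, needs_maintenance)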
Figure 3: Confusion matrix for Logistic regression
3. Conclusion
As part of our extensive investigation into predictive maintenance models, we analyzed five machine learning algorithms in detail: k-Nearest Neighbors (KNN), Support Vector Machines (SVM), Logistic Regression, Deep Neural Networks (DNN), and Light Gradient Boosting Machine (LightGBM). Our goal was to identify the best-performing model for equipment failure prediction and to optimize maintenance schedules according to the models' accuracy. First, we carefully assembled data from multiple sources, including sensor readings, operation logs, and maintenance records, and then preprocessed the dataset. Through feature encoding and normalization, we ensured that the data was consistent and compatible with all models. Each model was then put through a rigorous training process on a standardized dataset, and its hyperparameters were carefully tuned to obtain the best possible performance; a rough sketch of this evaluation loop is given below.
The outcomes of our evaluation were informative. With an accuracy of 62.34%, KNN was easy to use but sensitive to parameter selection and inefficient on larger datasets. Despite its capacity for high-dimensional spaces and flexible kernel choices, SVM produced an accuracy of only 59.1%, suggesting that model complexity and parameter tuning difficulties may be the cause of its limitations. Logistic Regression achieved an accuracy of 69.21%, a marginal improvement; although it offered a trade-off between interpretability and performance, more advanced models captured complex data patterns better than it did. With an accuracy of 86%, the Deep Neural Network (DNN) proved to be a formidable competitor, demonstrating its capacity to learn intricate associations from large amounts of data, even though it requires more processing power. In our comparative investigation, LightGBM emerged as the clear winner with an accuracy of 94%. Its leaf-wise tree growth strategy, histogram-based algorithm, and reliable handling of large-scale, high-dimensional datasets are all responsible for its remarkable performance. LightGBM is a strong option for predictive maintenance tasks because its efficiency, scalability, and superior tree-based learning greatly increase the dependability and affordability of maintenance operations.
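The following is a minimal, hedged sketch of the kind of evaluation loop described above (the synthetic data, preprocessing choices, and hyperparameters are placeholders, and an MLP is used as a simple stand-in for the DNN; the paper's actual configuration is not reproduced here). It assumes scikit-learn and the lightgbm package are available:

# Illustrative sketch only: data, preprocessing, and hyperparameters are assumed placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier   # simple stand-in for the DNN
from sklearn.metrics import accuracy_score
from lightgbm import LGBMClassifier

# Placeholder feature matrix and binary failure labels
X = np.random.rand(2000, 10)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "DNN (MLP stand-in)": MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500),
    "LightGBM": LGBMClassifier(n_estimators=200),
}

for name, clf in models.items():
    # Normalize features and fit each candidate model on the same standardized split
    pipeline = make_pipeline(StandardScaler(), clf)
    pipeline.fit(X_train, y_train)
    acc = accuracy_score(y_test, pipeline.predict(X_test))
    print(f"{name}: accuracy = {acc:.4f}")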
In conclusion, while simpler and more interpretable models such as Logistic Regression, KNN, and SVM are available, the more advanced algorithms proved to be more predictive. The two best performers were the DNN and LightGBM, with LightGBM showing the highest accuracy and efficiency. LightGBM is therefore the preferred model for predictive maintenance applications: its outstanding performance supports prompt, proactive maintenance interventions that reduce equipment failures and optimize maintenance schedules.