
Available online at www.sciencedirect.com

ScienceDirect

Procedia Manufacturing 42 (2020) 41–48
doi: 10.1016/j.promfg.2020.02.022

International Conference on Industry 4.0 and Smart Manufacturing (ISM 2019)

An ensemble-learning model for failure rate prediction


Braglia Marcello a, Castellano Davide b, Frosolini Marco a,*, Gabbrielli Roberto a, Marrazzini Leonardo a, Padellini Luca c

a Dipartimento di Ingegneria Civile e Industriale - Università di Pisa, Largo Lucio Lazzarino 2, 56126 Pisa, Italy
b Dipartimento di Ingegneria Chimica, dei Materiali e della Produzione Industriale - Università di Napoli Federico II, Piazzale Tecchio 80, 80125 Napoli, Italy
c Dipartimento di Ingegneria dell'Informazione - Università di Pisa, Via Girolamo Caruso 16, 56122 Pisa, Italy

* Corresponding author. Tel.: +39 050 2218139; fax: +39 050 2218140. E-mail address: [email protected]

Abstract

In the Industry 4.0 era, Preventive Maintenance (PM) is still an attractive solution to prevent breakdowns and failures and to reduce
maintenance and failure costs. A PM program is part of both the Total Productive Maintenance (TPM) philosophy and the Reliability Centered
Maintenance (RCM) process. A prerequisite for carrying out effective PM activities is the availability of a reliable estimate of the equipment failure
rate. Assessing it may be a hard task, as it requires analysing a large set of maintenance data, which includes both quantitative and qualitative
variables. To this aim, it is possible to exploit advanced data analysis techniques that permit extracting information and knowledge from big
datasets. This paper presents an ensemble-learning model to estimate the failure rate of equipment subject to different operating conditions. At
the same time, the method identifies the most important working parameters affecting the failure rate. An industrial application is
considered to show the potentialities and the effectiveness of the proposed method. In particular, a sample of 143 centrifugal pumps installed in
an oil refinery plant is analysed.

© 2020 The Authors. Published by Elsevier B.V.


This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the scientific committee of the International Conference on Industry 4.0 and Smart Manufacturing.

Keywords: Industry 4.0; Ensemble learning; Preventive maintenance; Failure rate

1. Introduction

The Industry 4.0 paradigm and the related technologies are currently under the spotlight of both academicians and practitioners [1]. The motivation is that the manufacturing industry is facing the so-called "data-driven revolution". The digitalization process that converts traditional factories into smart factories has given rise to an enormous growth of data production [3]. Smart organizations must rise to the challenge of recording and managing such "big data" and of extracting meaningful information from them by means of appropriate analytical techniques and tools.
In this Industry 4.0 era, preventive maintenance (PM) is still an attractive solution to prevent breakdowns and failures and to reduce maintenance and failure costs [4]. A PM program is part of both the Total Productive Maintenance (TPM) philosophy and the Reliability Centred Maintenance (RCM) process [6]. The primary objective of PM is to prevent failures before they occur. Comprehensive PM programs schedule repairs, lubrication, adjustments and machine rebuilds for all critical plant machinery. In order to support the work of maintenance experts, good PM practices require that all available data regarding failures are recorded into a well-organized database. All PM programs assume that machines will degrade within a time frame specific to their particular working conditions. Clearly, a reliable evaluation of the equipment failure rate makes it possible to carry out effective PM programs. On the other hand, the mode of operation and system- or plant-specific variables directly affect the operating life of each piece of equipment. This means that the failure rate depends on several factors whose identification and
quantification is a challenging task [7].
As observed by [8], failure rates can be estimated from empirical data and formulas available in handbooks, manufacturers' data, industry standards, MIL-standards, and so on. If quantitative data are not available, experts can make use of their own experience to assess values. However, all these methods are characterized by many limits. Failure rate formulas provided by handbooks or regulations are complex and valid only under very special operating conditions, which are seldom satisfied in practice. Failure rates suggested by manufacturers may be conservative, which may lead to excessive maintenance. Probabilistic approaches to estimating the time to failure (e.g., proportional hazards models) appear to be too complex to be applied in most practical situations, and the required statistical tools and competencies are hardly available in industry.
To overcome these issues, some researchers have proposed practical non-parametric approaches for failure rate analysis. [9] used the classification and regression tree (CART) methodology. [10] developed an approach based on artificial neural networks (ANNs). More recently, [11] presented a multivariate data classification technique.
The present work belongs to the same stream of research, but it exploits the advances in big data analytics to tackle two important tasks required to carry out an effective PM program: (i) estimating the failure rate of equipment subject to different operating conditions and (ii) identifying the most important working parameters affecting the failure rate. These activities can evidently take advantage of analytic techniques able to treat streams of "big data", as they require analysing a large set of maintenance data collected in a computerized maintenance management system (CMMS), which includes both quantitative and qualitative variables.
In order to accomplish these tasks, this paper proposes an ensemble-learning model that combines prediction results from multiple algorithms. Although ensemble-learning methods have been used to approach a variety of different problems, and the associated literature is rather ample [12], we have not found any contribution focused on using this technique to estimate the failure rate of equipment subject to different operating conditions and to discriminate the working parameters affecting the failure rate. Hence, the present work aims to fill this gap. Note that we attempt to develop neither a mathematical relation between failure rate and operating conditions nor lifetime distributions. Rather, our intention is to provide an easy-to-use approach that can be implemented in practice by means of free software. Specifically, an industrial application is considered to show the potentialities and the effectiveness of the proposed method. In particular, a sample of 143 centrifugal pumps installed in an oil refinery plant is analysed.

2. Basic Concepts of Ensemble Modelling

Ensemble modelling refers to the use of multiple learning algorithms to obtain better predictive capabilities than those obtained from any of the basic, constituent learning methods. An ensemble model works better when the original models present low correlation [13]. This happens because each of the constituent algorithms may contribute its own strengths. Typically, in predictive modelling and in data analytics in general, a single model is used on a given data sample. The dataset can be large and rich, without missing values and errors. However, the model often presents biases, high variability and inaccuracies that affect the overall reliability of its conclusions. Often, this is because some algorithms, though extremely powerful under given hypotheses and circumstances, suffer from the presence of previously unseen examples within the studied datasets. The same effect is introduced by outliers and rare values. On the contrary, an ensemble investigates the dataset using all its constituent algorithms, allowing each of them to support the others in the case of dubious outcomes. A famous example is represented by the random forest [15]. This algorithm builds numerous decision trees during training and gives, as the output, a single class that is the mode of the corresponding classes of the individual trees. In doing so, the method avoids the known overfitting behaviour of individual trees.
Building an ensemble requires building different models and combining their estimates. The building stage may be accomplished, for instance, by changing weights, data values, control parameters, variable subsets, or partitions of the input datasets. Bagging, short for Bootstrap Aggregating [13], uses the training dataset to build different decision trees and, finally, takes the majority vote or the average of their estimates. Random Forests [15] add stochastic components to increase diversity among the trees being combined. AdaBoost [16] builds models iteratively, changing case weights, and uses the weighted sum of the estimates. Good ensembles should present both accuracy and simplicity. However, to reach higher accuracy, models tend to become extremely complex. In doing so, they are exposed to the risk of overfitting and poor generalization. Regularization techniques have been introduced to reduce the complexity of model fitting procedures and have proved over time to yield extremely effective ensemble models [17].

3. Case study

In this section, a case study is presented to illustrate the creation of the ensemble model proposed in this work, which is used to estimate the failure rate of a set of centrifugal pumps subject to different operating conditions and to identify the most important parameters affecting the failure rate.

3.1. Brief overview of the refinery plant

The industrial application concerns several centrifugal pumps installed in an oil refinery plant. The plant performs the entire petrochemical cycle: crude oil supply, refinery process, and distribution of finished products. The transformation process adopts a "medium-high" conversion type, operating through the adoption of thermal processes. Fig. 1 provides the oil refinery processing scheme.

The refinery is characterized by processing and service systems which occupy a surface of nearly 650,000 m², with 3,000 km of piping. The plant has a storage capacity of more than 1,500,000 m³, an annual production capability of about 390,000 tons of oil, and an oil tanker receiving capability of up to 400,000 tons displacement. A closed-loop water system capable of delivering 700 m³/h of water and a fire system able to bring in up to 3,000 m³/h of seawater are included in the plant. An integrated combined cycle plant assures the necessary power supply of about 280 MW, operating by burning a synthesis gas obtained in the refining cycle.
After a primary distillation phase, known as topping, the materials flow into two separate distillation units:

• The atmospheric distillation unit (i.e., unifining) treats light fractions by separating petrol from liquefied petroleum gas (LPG). Then, petrol undergoes two further transformation phases (i.e., platforming and isomerization) required to increase the octane number and to eliminate aromatic compounds.
• The vacuum unit treats the middle distillation fraction (mainly kerosene) and feeds a desulphurization process.

Finally, the heavy fraction and all distillation residuals are processed by means of thermal cracking and visbreaking. These treatments are required to improve the oil conversion rate and to increase the overall production of lighter products.

Fig. 1. Oil refinery processing scheme.

3.2. Structure

The analysis considered 143 centrifugal pumps installed in the oil refinery plant described in Section 3.1. These pumps were monitored over a period of 18 months. In this period, operating time, failures and maintenance tasks were recorded in a standard computerized maintenance management system (CMMS). Table 1 shows a sample of the records included in the CMMS. Within the database, ten potential predictors were identified. While the name of some variables is self-explanatory, some others require a brief description. "Plant type" defines the part of the refinery plant where the pump operates. "Soot" identifies the solid carbon-based particles present in the fluid, which typically have a disruptive action on the seals. Finally, we clustered the seal type into four categories: single-seal (S), dual-seal (DS), lip-seal (SL), and tandem-seal (T).
The net operating time (NOT) included in Table 1 is defined as the cumulated functioning time from the start-up until the last observed failure. In formulas, the net operating time of the jth pump, $T_j^{\text{NOT}}$, can be expressed as follows:

$$T_j^{\text{NOT}} = \sum_{i=1}^{n_j} \left( t_{i,j}^{f} - t_{i-1,j}^{c} \right), \qquad t_{0,j}^{c} \equiv 0, \qquad (1)$$

where $n_j$ is the number of observed failures for the jth pump, $t_{i,j}^{f}$ is the failure time of failure i on pump j, and $t_{i,j}^{o}$ and $t_{i,j}^{c}$ are the opening time and the closing time, respectively, of the maintenance order regarding failure i on pump j (times are measured from the start-up of the pump, so that $t_{0,j}^{c} \equiv 0$).
The exact values of $t_{i,j}^{f}$ were not available in the database. Hence, we replaced $t_{i,j}^{f}$ with $t_{i,j}^{o}$ in the evaluation of the net operating time. This approximation can be justified by the following argument. Reactive maintenance is the first option, while condition-based maintenance applies only to the pumps that do not operate in active redundancy. However, even in this case, when an operating threshold limit is trespassed (e.g., excessive vibration and/or leakage) and a maintenance order is issued, the remaining useful life of the pump can be considered nearly null. Hence, the approximation $t_{i,j}^{f} \approx t_{i,j}^{o}$ can be assumed reasonable, and the MTBF is finally given by:

$$\mathrm{MTBF}_j = \frac{T_j^{\text{NOT}}}{n_j}, \qquad (2)$$

where the net operating time is obtained according to Eq. (1).

Table 1. Example of data available in the CMMS.

Code | Plant type | Service/fluid | Seal type | Soot | Nominal capacity [m³/h] | Nominal head [m] | Nominal power [kW] | Fluid temp. [°C] | Kinematic viscosity [cSt] | Density [kg/m³] | Fluid leakages | Irregular working | Vibrations | Mechanical failures | Electrical failures | Total number of failures | Net operating time [h] | MTBF [h]
P1001 | TOP1000 | Crude oil | S | No | 530 | 237 | 394.5 | 18 | 42 | 870 | 4 | 2 | 0 | 1 | 3 | 10 | 13127 | 1313
P1002 | TOP1000 | Residuum | T | Yes | 231 | 173.5 | 127 | 360 | 1.17 | 801 | 4 | 0 | 1 | 0 | 0 | 5 | 12283 | 2457
P1003 | TOP1000 | Petroleum | S | No | 63 | 117 | 22.5 | 187 | 0.46 | 670 | 8 | 0 | 0 | 0 | 0 | 8 | 15918 | 1990
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | …
P1010 | TOP1000 | Condensed water | S | No | 7.7 | 180 | 27.8 | 38 | 0.69 | 992.6 | 2 | 2 | 0 | 0 | 0 | 4 | 11560 | 2890
P1011 | TOP1000 | Preheated gasoil | S | No | 90 | 59 | 16 | 156 | 0.88 | 760 | 12 | 0 | 0 | 1 | 0 | 13 | 27817 | 2139
P1012 | TOP1000 | Acid water | S | No | 28 | 190 | 43.4 | 16 | 1.11 | 998.5 | 1 | 0 | 1 | 0 | 2 | 4 | 11067 | 2767
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | …

3.3. Ensemble model for failures analysis

To apply ensemble modelling to the pumps' dataset, the stacked generalization and the bagging methods were used. One of the most interesting aspects of stacking is that it may be used to combine models of different types. The most important goal of stacking is, possibly, the reduction of bias. To begin with, the algorithm splits the training set into two separate subsets and trains several learners on the first subset. The remaining data are used to test and validate the model but, instead of using a monolithic approach in which the best learner becomes the winner, the outcomes of all the models
are mixed, possibly in a nonlinear fashion, to get the most out of all the algorithms. The weights used to mix the various models are obtained by means of a supplementary algorithm (usually a logistic regression process) that compares the outcomes with the inputs of all the models used. Therefore, the key point lies in the fact that all the models are compared and judged on subsets of data that were not used to create them.
Bagging is primarily used to decrease the variance of the prediction/classification. It works by generating supplementary data for training the models starting from the original dataset. To do so, it suitably combines data with repetitions to produce multisets of the same cardinality as the original data. Obviously, this method cannot improve the predictive capability just by increasing the size of the training set, but it strongly decreases the variance, narrowly tuning the prediction itself.
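To make the two mechanisms concrete, the following minimal sketch builds a bagging model and a stacked model with scikit-learn. It is an illustrative example under assumptions (a placeholder dataset and a logistic-regression combiner), not the workflow actually used in this work, which relies on RapidMiner.

```python
# Minimal sketch of bagging and stacking (scikit-learn, illustrative only).
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # placeholder dataset, not the pump data

# Bagging: the same learner (a decision tree by default) is fitted on bootstrap
# resamples of the training data; majority voting over the copies reduces variance.
bagging = BaggingClassifier(n_estimators=50, random_state=0)

# Stacking: heterogeneous base learners are combined by a supplementary model
# (here a logistic regression) trained on predictions made on data unseen by the bases.
stacking = StackingClassifier(
    estimators=[("knn", KNeighborsClassifier()),
                ("nb", GaussianNB()),
                ("rf", RandomForestClassifier(n_estimators=40, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000))

for name, model in [("bagging", bagging), ("stacking", stacking)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```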
The two models have been built by means of the educational licensed release of the RapidMiner Studio 9® software package. It is a data science software platform that provides an integrated environment for data preparation, machine learning and predictive analytics. It also supports all steps of the learning process (Fig. 2), following the cross-industry standard process for data mining (CRISP-DM), including data preparation, results visualization, model validation and optimization within an easy and user-friendly graphic interface.

Figure 2. The CRISP-DM process.
To begin with, the original data have been saved as a comma separated value (CSV) file and imported into the RapidMiner® workbench. The case study and the corresponding characteristics and aspects were all well known from previous research activities; therefore, with respect to steps 1 and 2 of the CRISP-DM process, no additional information was necessary.
Referring to step 3, namely the data preparation, the dataset has been cleansed and some attributes (columns) that resulted incomplete or clearly wrong have been removed from the dataset. The only significant modification to the original dataset was in the computation of the MTBF, expressed in hours. Indeed, some pumps had no failures over relatively long time periods. In such cases, it was decided to use a large numerical value to represent infinity. On the other hand, some pumps presented no failure at all, but their operating time was very short and, therefore, there was a great deal of uncertainty about their MTBF value. These pumps have been characterized by an MTBF equal to -1 and have been later removed from the analysis.
As a result, it was possible to pass immediately to step 4 and start to model the whole ensemble process. First, it was necessary to change the attribute role of the column "Code" from "regular" to "id", meaning that this field should be considered only for identification purposes, without using it for classification/prediction. Then, the attribute "MTBF" has been used as the "label" or, in other words, the goal of the analysis. These two activities have been performed by means of a "Set Role" operator, as shown in Fig. 3. RapidMiner® changes the background colours of the corresponding columns to visually communicate their new state.

Figure 3. Setting roles.

Successively, using the "Select Attributes" operator, a subset of the available columns has been chosen for the analysis. In particular, the fields "Failures", "Failure Rate" and "Time" have been removed, because the "MTBF" attribute is clearly given by their combination (Fig. 4).

Figure 4. Attribute selection.

As stated, a filter has been introduced to remove those examples that showed an "MTBF" equal to -1. Thus, the dataset was reduced to 130 records, with 2 special and 11 regular attributes. Additionally, a "Discretize" operator has been adopted to arrange the label field (MTBF) into homogeneous ranges. This operator discretizes the selected numerical attributes to nominal attributes: all numerical values are mapped to the defined classes according to the values specified by the user. Both for comparison purposes and because they proved to be representative and coherent, the ranges are kept similar to those already used in previous analyses. Briefly, there are 5 different ranges, as summarised in Table 2.

Table 2. MTBF discretization ranges.

MTBF range | Label
Up to 2500 hours | VERY LOW
2500 to 4500 hours | LOW
4500 to 9000 hours | MODERATE
9000 to 16000 hours | HIGH
Above 16000 hours | VERY HIGH
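The same preparation logic, i.e. the MTBF computation of Eq. (2) followed by the discretization into the Table 2 ranges, can be sketched in a few lines of Python; the column names are hypothetical, since the actual steps were carried out with RapidMiner operators, and the values are taken from the Table 1 excerpt.

```python
# Sketch of the data-preparation steps described above (pandas instead of the
# RapidMiner operators actually used; assumed column names, Table 1 values).
import pandas as pd

pumps = pd.DataFrame({
    "Code": ["P1001", "P1002", "P1003"],
    "Failures": [10, 5, 8],                      # total number of observed failures n_j
    "NetOperatingTime": [13127, 12283, 15918],   # T_j^NOT in hours, from Eq. (1)
})

# Eq. (2): MTBF_j = T_j^NOT / n_j (zero-failure pumps would get a large value or -1).
pumps["MTBF"] = (pumps["NetOperatingTime"] / pumps["Failures"]).round()
pumps = pumps[pumps["MTBF"] != -1]               # drop the short-lived, zero-failure pumps

# "Discretize": map the numerical label onto the five nominal ranges of Table 2.
bins = [0, 2500, 4500, 9000, 16000, float("inf")]
labels = ["VERY LOW", "LOW", "MODERATE", "HIGH", "VERY HIGH"]
pumps["MTBF_class"] = pd.cut(pumps["MTBF"], bins=bins, labels=labels)
print(pumps)
```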
At this point, the dataset is partitioned into two different samples, leaving 40% of the data unseen by the learner and available for a later validation. To this aim, the well-known stratified sampling algorithm is used: it builds random subsets and ensures that the class distribution in the subsets is the same as in the whole dataset.
Then, the ensemble model was built as follows. First, an "Optimise Parameters (Grid)" operator was introduced to perform multiple optimizations on the various algorithms (these settings are reported in detail in Table 3). This is a nested operator that executes all given sub-processes for all combinations of the selected values of the parameters. Finally, it gives as output the optimal parameters and, at the same time, it sets and runs the model. Within the nested operator, a "Split Validation" operator is used to separate the available examples into two different datasets, namely the training and the testing sets. Instead of defining the splitting ratio as a static value, it is passed as a parameter to the optimization algorithm; in particular, it varies in the interval 50-80%.
These datasets are then passed to the actual learner. As shown in Fig. 5, the learning (training) and validation (testing) structures within the "Split Validation" operator are constituted, respectively, by the "Stacking" algorithm, which returns the optimised model, and by a testing structure that applies it to the testing dataset and measures the overall performance.

Figure 5. Learning and validation structures.

The "Stacking" operator is also a nested operator. It contains two separate sub-processes (Fig. 6), namely the "Base learner" and the "Stacking model learner". The "Base learner" is the operator where all the simple learning algorithms are trained and evaluated with respect to their performance. Then, the "Stacking model learner" decides how to mix their characteristics, strengths and weaknesses to build the final ensemble model.

Figure 6. Stacking model structure.

The "Base learner" includes three algorithms: k-nearest neighbours (k-NN), naïve Bayes, and random forest. The "Stacking model learner" makes use of decision trees. All these are widely known methods in machine learning [18]:
• the k-NN algorithm is a non-parametric method for classification, whose output is a class membership;
• naïve Bayes is a probabilistic model-based algorithm that relies on Bayes' theorem with independence assumptions between the features;
• random forests are an ensemble learning method which builds a multitude of decision trees at training time and gives as output the class that is (typically) the mode of the classes of the individual trees;
• a decision tree is a non-parametric method used to go from observations about an item to conclusions about the item's target value.

Finally, the ensemble model is used to perform a validation cycle. This activity involves using those samples (40% of the entire dataset) that had been set aside in a previous step. The "Optimise Parameters (Grid)" operator is used to evaluate the optimal configuration, acting on the parameters reported in Table 3.

Table 3. Optimization parameters.

Operator | Parameter | Range
Validation | Split ratio | 0.5 to 0.8
Random forest | Number of trees | 10 to 100
Decision Tree | Criterion | Gain ratio, Information gain, Gini index, Accuracy
Decision Tree | Min. leaf size | 2 to 6
Decision Tree | Min. size for split | 4 to 12

Running the ensemble model yields the following results. To begin with, the optimal values for the above-mentioned parameters are summarised in Table 4.

Table 4. Optimal values.

Operator | Parameter | Value
Validation | Split ratio | 0.79
Random forest | Number of trees | 40
Decision Tree | Criterion | Accuracy
Decision Tree | Min. leaf size | 4
Decision Tree | Min. size for split | 8
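As a rough, non-authoritative analogue of this RapidMiner workflow, the sketch below wires the same base learners and a decision-tree meta learner into a stacked classifier and grid-searches parameters similar to those of Table 3 with scikit-learn. The dataset is a synthetic stand-in, and the grid values only approximate the ranges actually explored (the split-ratio optimization is not reproduced).

```python
# Illustrative scikit-learn analogue of the Section 3.3 workflow (not the authors' model).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 130 pump records with 11 predictors and 5 MTBF classes.
X, y = make_classification(n_samples=130, n_features=11, n_informative=6,
                           n_classes=5, random_state=0)

# Stratified split: 40% of the data is kept unseen for the final validation cycle.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0)

# Base learners (k-NN, naive Bayes, random forest) + decision-tree meta learner.
stack = StackingClassifier(
    estimators=[("knn", KNeighborsClassifier()),
                ("nb", GaussianNB()),
                ("rf", RandomForestClassifier(random_state=0))],
    final_estimator=DecisionTreeClassifier(random_state=0))

# Grid roughly mirroring Table 3 (number of trees, split criterion, leaf/split sizes).
grid = {
    "rf__n_estimators": [10, 40, 70, 100],
    "final_estimator__criterion": ["gini", "entropy"],
    "final_estimator__min_samples_leaf": [2, 4, 6],
    "final_estimator__min_samples_split": [4, 8, 12],
}
search = GridSearchCV(stack, grid, cv=3).fit(X_train, y_train)
print(search.best_params_)
print("validation accuracy:", search.best_estimator_.score(X_valid, y_valid))
```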

Table 5. Ensemble model confusion matrix (accuracy: 96.15%).

 | True VERY LOW | True LOW | True MODERATE | True HIGH | True VERY HIGH | Class precision
Pred. VERY LOW | 9 | 0 | 0 | 0 | 0 | 100%
Pred. LOW | 0 | 5 | 0 | 0 | 0 | 100%
Pred. MODERATE | 0 | 0 | 5 | 0 | 1 | 83.33%
Pred. HIGH | 0 | 0 | 0 | 3 | 0 | 100%
Pred. VERY HIGH | 0 | 0 | 0 | 0 | 3 | 100%
Class recall | 100% | 100% | 100% | 100% | 75% |
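The summary figures in Table 5 follow directly from the matrix itself; the small check below re-derives them using only the counts reported above.

```python
# Recomputing the Table 5 summary figures (rows = predicted class, columns = true class,
# class order: VERY LOW, LOW, MODERATE, HIGH, VERY HIGH).
import numpy as np

cm = np.array([[9, 0, 0, 0, 0],
               [0, 5, 0, 0, 0],
               [0, 0, 5, 0, 1],
               [0, 0, 0, 3, 0],
               [0, 0, 0, 0, 3]])

accuracy = np.trace(cm) / cm.sum()         # 25 / 26 = 0.9615
precision = np.diag(cm) / cm.sum(axis=1)   # per predicted class (row-wise)
recall = np.diag(cm) / cm.sum(axis=0)      # per true class (column-wise)
print(accuracy, precision, recall)
```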

Figure 7. Random forest trees with uncertainty on a LOW label.


Applying such parameters, the ensemble model reached a significant accuracy value of 96.15%. The corresponding confusion matrix is given in Table 5. It clearly shows that only one example out of the 26 within the testing dataset has been incorrectly classified, whereas all the other results are exact.
With the aim of showing the improved capabilities of the ensemble model, two trees evaluated by the random forest algorithm are reported in Fig. 7, where it is evident that the algorithm shows some uncertainty with respect to some examples belonging to the label LOW.
On the contrary, the ensemble model is able to overcome this doubt and shows a very good capability of discriminating and correctly classifying the majority of the examples (Fig. 8). Even more interestingly, the system evaluates the prediction and the confidence for each row, giving the analyst a very powerful tool to investigate the outcomes. As an example, in Fig. 9, some rows are reported along with the true MTBF value, the prediction and its confidence level.

Figure 8. Correctly classified examples (by the ensemble model).

Figure 9. Results of the analysis (example).

4. Conclusions

The paper presented an innovative framework based on an ensemble learning model that, by combining results from multiple algorithms, makes it possible to classify items in terms of their MTBF. Specifically, the framework is an ex-post analysis that, starting from the "big data" recorded in a CMMS, which include both quantitative and qualitative variables, tries to give a good estimation of the MTBF of installed equipment. The novel approach is characterized by several peculiar advantages: (i) it exploits the advances in big data analytics to estimate the failure rate of equipment subject to different operating conditions; (ii) it makes it possible to discriminate the working parameters affecting the failure rate; and (iii) it obtains better predictive capabilities than those obtained from any of the basic, constituent learning methods. Finally, the effectiveness and the usefulness of the novel analysis have been demonstrated using an industrial case study about 143 centrifugal pumps installed in an oil refinery plant. As proven by the results, the ensemble reached a significant accuracy value of 96.15%, giving the analyst a very powerful tool to investigate the true MTBF value. Clearly, due to the extreme variability in the operating conditions, results are site specific, unless they are used exclusively to identify the most critical factors in the failure mechanisms of specific equipment. This means that the model, though effective and able to provide extremely good results, cannot be used "as-is" to estimate the MTBF of the equipment installed in a different industrial context. Indeed, it must be reconfigured every time, according to the novel industrial setting in which it is applied. In brief, it requires a new dataset, large and rich enough, possibly without missing values and errors, to be used during the training stage. If this is the case, thanks to the large number of available algorithms that can be used within the ensemble model, a configuration will certainly be found that provides a good degree of generalization and that is therefore able to correctly estimate the MTBF.



References

[1] Bokrantz J, Skoogh A, Berlin C, Stahre J. Maintenance in digitalised manufacturing: Delphi-based scenarios for 2030. International Journal of Production Economics 2017; 191: 154–169.
[2] Mourtzis D, Fotia S, Boli N, Pittaro P. Product-service system (PSS) complexity metrics within mass customization and Industry 4.0 environment. The International Journal of Advanced Manufacturing Technology 2018; 97(1-4): 1–13.
[3] Bumblauskas D, Gemmill D, Igou A, Anzengruber J. Smart Maintenance Decision Support Systems (SMDSS) based on corporate big data analytics. Expert Systems with Applications 2017; 90: 303–317.
[4] Duan C, Deng C, Wang B. Optimal multi-level condition-based maintenance policy for multi-unit systems under economic dependence. The International Journal of Advanced Manufacturing Technology 2017; 91(9-12): 4299–4312.
[5] Zandieh M, Joreir-Ahmadi MN, Fadaei-Rafsanjani A. Buffer allocation problem and preventive maintenance planning in non-homogenous unreliable production lines. The International Journal of Advanced Manufacturing Technology 2017; 91(5-8): 2581–2593.
[6] Agustiady TK, Cudney EA. Total Productive Maintenance: Strategies and Implementation Guide. Boca Raton, FL: CRC Press; 2015.
[7] O'Connor PDT, Kleyner A. Practical Reliability Engineering (5th ed.). New York, NY: John Wiley & Sons; 2012.
[8] Shalev DM, Tiran J. Condition-based fault tree analysis (CBFTA): A new method for improved fault tree analysis (FTA), reliability and safety calculations. Reliability Engineering and System Safety 2007; 92(9): 1231–1241.
[9] Bevilacqua M, Braglia M, Montanari R. The classification and regression tree approach to pump failure rate analysis. Reliability Engineering and System Safety 2003; 79(1): 59–67.
[10] Bevilacqua M, Braglia M, Frosolini M, Montanari R. Failure rate prediction with artificial neural networks. Journal of Quality in Maintenance Engineering 2005; 11(3): 279–294.
[11] Braglia M, Carmignani G, Frosolini M, Zammori F. Data classification and MTBF prediction with a multivariate analysis approach. Reliability Engineering and System Safety 2012; 97(1): 27–35.
[12] Sagi O, Rokach L. Ensemble learning: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2018; 8(4): e1249.
[13] Breiman L. Bagging predictors. Machine Learning 1996; 26(2): 123–140.
[14] Elder JF. A review of Machine Learning, Neural and Statistical Classification. Journal of the American Statistical Association 1996; 91(433): 436–437.
[15] Breiman L. Random Forests, random features. Berkeley; 1999.
[16] Freund Y, Schapire RE. Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference; 1996.
[17] Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society B 1996; 58(1): 267–288.
[18] Ramasubramanian K, Singh A. Machine Learning Using R. New York, NY: Apress; 2016.

Authors Biography

Marcello Braglia graduated (with distinction) in 1988 in Electronic Engineering at Politecnico di Milano. Since 1995, he has been a Researcher in Mechanical Technology and Production Systems at the Università degli Studi di Brescia. Since 1998, he has been employed as a researcher and, since 2002, as a Full Professor in Industrial Plants at the Università di Pisa. His research activities mainly concern maintenance, reliability, production planning, lean production, logistics and statistical quality control. He is the author of about 180 technical papers published in national and international journals and conference proceedings. He is a member of ANIMP (National Association on Industrial Plants) and AIDI (National Association of Academicians on Industrial Plants).

Davide Castellano graduated in 2010 in Management Engineering at Università di Pisa. In 2015, he obtained a PhD in Mechanical Engineering at Università di Pisa with specialisation in Operations Management. In 2015, he was a Research Fellow at Università di Pisa. At present, he is a Research Fellow at Università degli Studi di Napoli "Federico II". His research activity mainly concerns maintenance, reliability, production management, logistics, and inventory management. He is the author of more than 30 technical papers published in international journals and conference proceedings.

Marco Frosolini graduated in Mechanical Engineering at the Università di Pisa and obtained a PhD in Mechanical Engineering in 2005. He is currently Associate Professor in Industrial Plants within the same University. His research activities mainly concern equipment maintenance and reliability, production planning, logistics and project management. He is also interested in industrial information systems and Industry 4.0. He is the author of more than 30 technical papers published in national and international journals and conference proceedings. He is a member of AIDI (National Association of Academicians on Industrial Plants).

Roberto Gabbrielli graduated with honours in Mechanical Engineering, specializing in Energy, at the University of Pisa (Italy). He has a PhD in Energy Power Systems and is Associate Professor in Industrial Systems Engineering at the Department of Civil and Industrial Engineering of the University of Pisa (Italy). His current research concerns production planning and control, the development of decision support systems for industrial investments, occupational safety, energy saving, energy storage and the reduction of the environmental impact of industrial systems. He has authored more than 25 papers published in international scientific journals.

Leonardo Marrazzini graduated (110/110) in 2016 in Mechanical Engineering at the University of Pisa. In the same year, he began his PhD studies in Industrial Plants within the same university. His research activities mainly concern the adaptation of Lean Manufacturing principles to engineering-to-order (ETO) production environments. His research focuses on models and techniques to support various company operations.

Luca Padellini graduated in 2018 in Mechanical Engineering at the University of Pisa. In the same year, he began his PhD studies in Smart Industry within the same University. His research activities mainly concern the digitalization of the supply chain in the Tuscan fashion district. His research focuses on models and techniques able to improve the communication and the transfer of information between the members of the supply chain.
