
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/360344742

Quality 4.0 – an evolution of Six Sigma DMAIC

Article in International Journal of Lean Six Sigma · May 2022
DOI: 10.1108/IJLSS-05-2021-0091

5 authors, including: Carlos A. Escobar (Harvard University), Daniela Macias (Tecnológico de Monterrey), Megan McGovern (General Motors Company) and Marcela Hernández de Menéndez (Tecnológico de Monterrey).

All content following this page was uploaded by Carlos A. Escobar on 16 November 2023.


The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/2040-4166.htm

Quality 4.0 – an evolution of Six Sigma DMAIC

Carlos Alberto Escobar
Department of Global Research and Development, General Motors Company, Warren, Michigan, USA
Daniela Macias
Department of Graduate Studies of the School of Engineering and Sciences, Tecnologico de Monterrey, Monterrey, México
Megan McGovern
Department of Global Research and Development, General Motors Company, Warren, Michigan, USA, and
Marcela Hernandez-de-Menendez and Ruben Morales-Menendez
Department of Engineering and Sciences, Tecnologico de Monterrey, Monterrey, México

Received 11 May 2021
Revised 15 September 2021; 23 December 2021
Accepted 26 January 2022

Abstract
Purpose – By successfully implementing Quality 4.0, manufacturing companies can be recognized among the most advanced and influential companies in the world. However, successful implementation poses one of the most relevant challenges to Industry 4.0. According to recent surveys, 80%–87% of data science projects never make it to production. Despite the low deployment success rate, more than 75% of investors are maintaining or increasing their investments in artificial intelligence (AI). To help quality decision-makers improve the current situation, this paper aims to review Process Monitoring for Quality (PMQ), a Quality 4.0 initiative, along with its practical and managerial implications. Furthermore, a real case study is presented to demonstrate its application.
Design/methodology/approach – The proposed Quality 4.0 initiative improves conventional quality control methods by monitoring a process and detecting defective items in real time. Defect detection is formulated as a binary classification problem. Following the same path as Six Sigma's define, measure, analyze, improve, control (DMAIC), Quality 4.0-based innovation is guided by Identify, Acsensorize, Discover, Learn, Predict, Redesign and Relearn (IADLPR2), an ad hoc seven-step problem-solving approach.
Findings – The IADLPR2 approach has the ability to identify and solve engineering intractable problems
using AI. This is especially intriguing because numerous quality-driven manufacturing decision-makers
consistently cite difficulties in developing a business vision for this technology.
Practical implications – From the proposed method, quality-driven decision-makers will learn how to
launch a Quality 4.0 initiative, while quality-driven engineers will learn how to systematically solve
intractable problems through AI.

© Carlos Alberto Escobar, Daniela Macias, Megan McGovern, Marcela Hernandez-de-Menendez and Ruben Morales-Menendez.
The authors would like to thank Enago (www.enago.com) for the English language review. The authors thank Tecnologico de Monterrey and General Motors for the resources provided for the development of this paper.

International Journal of Lean Six Sigma
Emerald Publishing Limited
2040-4166
DOI 10.1108/IJLSS-05-2021-0091
Originality/value – An anthology of the authors' own projects enables the presentation of a comprehensive Quality 4.0 initiative and reports the first case study of the IADLPR2 approach. Each of the steps is used to solve a real General Motors case study.
Keywords Quality control, Smart manufacturing, Artificial intelligence, Big data
Paper type Research paper

1. Introduction
“Industrial big data (IBD), Industrial Internet of Things (IIoT) and artificial intelligence (AI)
are advancing the new era of manufacturing” Smart manufacturing (Kusiak, 2018). The
application of these technologies to monitor, control and improve manufacturing quality has
initiated a new era of quality – Quality 4.0 (Radziwill, 2018). Manufacturing companies can
competitively be recognized among the most advanced and influential companies in the
world by successfully implementing Quality 4.0 (Escobar et al., 2021a, 2021b, 2021c, 2021d,
2021e; Javaid et al., 2021). A formal definition is (Escobar et al., 2021a, 2021b, 2021c, 2021d,
2021e):
Quality 4.0 is the fourth wave in the quality movement (1. Statistical quality control (SQC), 2. Total
quality management (TQM), 3. Six sigma, 4. Quality 4.0). This quality philosophy is based on the
statistical and managerial fundamentals of the previous philosophies. It leverages IBD, IIoT, and
AI to solve completely new sets of complex engineering problems. Quality 4.0 is based on a new
paradigm that enables smart decisions through empirical learning, empirical knowledge
discovery, and real-time data generation, collection, and analysis.
Modern quality control (QC) in the USA started a century ago when the statistician W.A.
Shewhart at Western Electric began focusing on controlling processes using statistical
quality control (SQC) methods, making quality relevant not only for finished products but
also for the processes that manufactured them. After several years, highly influenced by W.
E. Deming and J.M. Juran, Japanese manufacturers significantly increased their market
shares in the USA due to their superior quality. In response, many CEOs of major firms took
the initiative to provide personal leadership in the quality movement. The response not only
emphasized using SQC methods but also quality management approaches that
encompassed an entire organization and became known as total quality management
(TQM). A few years later, B. Smith developed Six Sigma, a reactive approach to eliminate
defects from all processes by identifying and removing the main sources of variation (Juran,
1995; Maguad, 2006; Mândru et al., 2011). The principles driving the Toyota Production
Systems were incorporated into Six Sigma to create Lean Six Sigma. This philosophy
identifies the nonvalue added activities for the end customer and uses this information to
remove waste from processes (Atmaca and Girenes, 2013). Each quality philosophy used a
scientific method in the form of problem-solving (Moen and Norman, 2010). Figure 1
demonstrates the evolution of the quality movement, the paradigm and the associated
problem-solving strategy of each philosophy.
Although conventional quality philosophies have raised manufacturing standards to
very high levels, they have plateaued and demonstrated limitations in addressing the
challenges posed to IBD (Villanova University, 2020). Furthermore, the factory of the future
is driven by manufacturing systems with rapidly increasing complexity, hyperdimensional
feature spaces and non-Gaussian pseudo-chaotic behaviors that contradict orthodox SQC
methods (Wuest et al., 2014). In light of these challenges, Quality 4.0 is the next natural step
in the evolution of quality. This paper presents a Quality 4.0 initiative supported by an
evolved problem-solving strategy.
Figure 1. Evolution of the problem-solving strategy in the modern quality movement

Recently, Quality 4.0 has become a popular concept, with various authors offering their own interpretations of what it means. According to Radziwill (2018), Quality 4.0 is the "[. . .] pursuit for performance excellence" through digital transformation. The emphasis is on technological innovations and connectivity, and on describing their impact on systems' transformation using big data.
The overviews of Quality 4.0 provided by Ventura Carvalho et al. (2021) and Sader et al.
(2021) focus on quality management. The digitalization of TQM and the application of the
technologies available in Industry 4.0 are the main focus of this review. Their relationship is
presented to ascertain which technologies would improve quality practices.
In Zonnenshain and Kenett (2020), the authors described quality standards in the context
of the fourth industrial revolution (Industry 4.0). They included the big data concept of
detection and diagnosis of processes and failures, considering that the maturity level in
these initiatives requires improvement. A proposal integrating enabling technologies is
discussed, and a future strategy for companies to use when considering Quality 4.0
implementation is presented.
Zairi (2019) defines Quality 4.0 as performance excellence and describes the path toward excellence in quality, focusing on the organizational mindset shift required to meet this standard: foundations, limitations, past mistakes and the adaptability needed to achieve the goal.
As stated in Zairi (2018), Quality 4.0 has reached a point where focus is lost among all the characteristics describing the concept. The main issues are described, including a deviation of the goal of quality implementation from products to services, the limitations encountered and an analysis of the final customer experience.
Building on the previous reference, the shortcomings of Quality 4.0 highlight the need for a transformation of quality jobs as standards continue to rise. In Shakeri (2020), the evolution toward Quality 4.0 is described. As companies develop higher-quality processes, nonvalue-added activities such as auditing are being eliminated because they have negative economic impacts. Jobs need to evolve, and quality management professionals need to work with technological processes, such as big data and machine learning (ML).
In Escobar et al. (2021a, 2021b, 2021c, 2021d, 2021e), Quality 4.0 Green, Black and Master
Black Belt certifications are proposed. The authors established a curriculum for engineers
and quality professionals to perform and execute successful Quality 4.0 practices. The
flexibility of learning and adapting will produce professionals capable of implementing
Quality 4.0 in their organizations.
In Alzahrani et al. (2021), a review of Quality 4.0 was performed, providing a description
of the LNS Research Quality 4.0 framework, where conventional quality is compared against
new quality processes and technologies, such as big data and IoT. LNS Research is a
prestigious industry analyst firm specializing in manufacturing and industrial operations.
In Javaid et al. (2021a) and Godina and Matias (2018), a general description of Quality 4.0 is provided.
According to Escobar et al. (2021a, 2021b, 2021c, 2021d, 2021e), QC, statistics,
programming, optimization, ML and manufacturing are the areas of knowledge of Quality
4.0 (Figure 2). However, a problem-solving strategy is required to combine them in a
meaningful way.
Although the supporting technologies of Quality 4.0 have the potential to advance quality
standards, successful implementation is one of the most pressing challenges confronting
Industry 4.0. According to recent surveys, 80%–87% of data science projects never reach
production (Staff, 2019; Gartner Research, 2018). This is understandable given that current
quality benchmarks, conformance, productivity and innovation in industrial manufacturing
have set a very high standard, even for new technologies (Prem, 2019; Venkatesh and
Sumangala, 2018; Sharma et al., 2018). For example, from the perspective of manufacturing quality, most world-class companies have combined conventional QC methods to create high-conformance production environments. Process monitoring (PM) charts have been established to improve the process capability index at the industrial benchmark of sigma level four (Sreenivasulu Reddy et al., 2022; Sharma et al., 2018). This sigma level generates 6,210 defects per million opportunities (DPMO) (Deniz and Çimen, 2018; Fursule et al., 2012). Detecting these defects to move manufacturing processes to the next sigma level is one of the primary intellectual challenges confronting AI (Escobar et al., 2020) (Figure 3).

Figure 2. Areas of knowledge of Quality 4.0
As presented in the Quality 4.0 review, the authors described the enabling technologies,
value propositions, technical challenges, required skills and even the future of quality jobs.
However, a new problem-solving strategy to systematically drive AI-based innovation
has not been proposed. What is the next problem-solving strategy in this context? This is
the main research question addressed in this study.
The Quality 4.0 initiative based on PMQ (Abell et al., 2017) is integrated with a seven-step problem-solving strategy that guides its implementation: Identify, Acsensorize, Discover, Learn, Predict, Redesign and Relearn (IADLPR2) (Escobar et al., 2021a, 2021b, 2021c, 2021d, 2021e) (Figure 4). The main concepts, methods, technologies, supporting theories and implementation insights are briefly discussed. The first case study in which these seven steps are used to solve a real, complex problem at General Motors is reported and discussed.
The rest of this study is organized as follows: the new technologies, AI, IIoT and IBD are
briefly reviewed in the context of manufacturing in Section 2. The classification principle is
described in Section 3. Modern applications of ML for PM and QC are reviewed in Section 4.
PMQ and its applications are described in Section 5. The seven-step problem-solving
strategy is described in Section 6. Then, a real case study is presented in Section 7
discussing practical and managerial implications. Finally, Section 8 presents the conclusions
of the study.

2. Artificial intelligence, industrial internet of things and industrial big data


2.1 Artificial intelligence
AI is transforming every industry. This technology provides an opportunity for quality-
driven decision-makers to distinguish themselves and take the lead in their respective
business segments. According to a simulation study, by 2030, roughly 70% of companies may have adopted at least one type of AI technology, potentially generating an additional $13tn in economic growth (Bughin et al., 2018). However, Brookings reported that only 17% of senior decision-makers in the USA are familiar with AI (West and Allen, 2018).

Figure 3. Current conformance rate across manufacturing companies. Detecting the 0.011% on each side is the current challenge for AI

Figure 4. Problem-solving strategy for Quality 4.0
Although they understand the potential of this technology, they lack a clear deployment
strategy.
The overall goal of AI is to create technologies that augment human intelligence or take
over the tedious, monotonous, risky, mortifying and dehumanizing jobs. These technologies
are programmed to autonomously handle situations without human intervention. ML,
robotics, computer vision, natural language processing and expert systems are the common
and thriving AI research areas (Pannu, 2015) (Figure 5).
ML (Murphy, 2012) is a branch of AI that gives machines the skills to learn from examples without being explicitly programmed. Machine learning algorithms (MLA) play a crucial role in the structure of AI, becoming smarter by self-learning from big data without human intervention. MLA automatically learn patterns without assuming a probabilistic distribution or a predefined model form. Simple algorithms are used in simple applications, while more complex ones help solve complicated AI problems. ML techniques are classified into three broad categories (Loukas, 2020):
(1) Supervised learning. All the observations in the data set are labeled, and the MLA
learns to predict the output using the input data. Supervised models are further
categorized into regression and classification methods.
(2) Unsupervised learning. All the observations in the data set are unlabeled, and the MLA learns its inherent structure using the input data. Clustering, dimensionality reduction and association methods are included in this category.
(3) Reinforcement learning. This family of models consists of MLA that use the estimated errors as rewards or penalties. The goal is to identify the best actions that maximize the long-term reward. Trial-and-error search and delayed reward are the most essential characteristics of this approach.

Figure 5. Areas of AI
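The three learning paradigms above can be sketched in miniature. The following toy code is illustrative only (the function names and data are our own, not from the paper): a supervised learner uses labels to place a decision threshold, while an unsupervised learner recovers the same two-group structure from unlabeled data.

```python
# Minimal sketches of supervised vs unsupervised learning (toy code).

def supervised_fit(points, labels):
    """Learn a 1-D threshold from labeled data: midpoint of class means."""
    pos = [x for x, y in zip(points, labels) if y == 1]
    neg = [x for x, y in zip(points, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def unsupervised_two_means(points, iters=10):
    """Cluster unlabeled 1-D data into two groups (a tiny k-means)."""
    c1, c2 = min(points), max(points)
    for _ in range(iters):
        g1 = [x for x in points if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in points if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return c1, c2

# Supervised: labeled good (0) vs defective (1) measurements
t = supervised_fit([1.0, 1.2, 3.8, 4.0], [0, 0, 1, 1])
# Unsupervised: same measurements, no labels; structure is still recoverable
c1, c2 = unsupervised_two_means([1.0, 1.2, 3.8, 4.0])
```

Reinforcement learning is omitted here because a meaningful reward loop needs more machinery than a few lines allow.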

2.1.1 Desirable characteristics of machine learning projects. Selecting the right ML projects drives a company's success in the deployment of AI, since many projects are ill-conditioned from the start. A few rules of thumb should be followed when selecting the initial projects (Ng, 2019; Google Cloud, 2021):
• Learning a simple concept with a lot of data is more likely to be feasible, e.g. an activity that requires only a few seconds of mental processing.
• Quick wins are good starting points: projects that can yield value within 6–12 months.
• Complex problems with limited data are not suitable initial candidate projects. However, once the data science team has gained traction, more complex problems can be pursued.
• Projects should be relevant to the industry and should leverage the know-how of the company.
• Projects should address the impossible-vs-straightforward as well as the ambiguous-vs-specific trade-offs. They should not be impossible, but should also not address simple problems that can be solved using conventional methods (e.g. a t-test).
• The two-dimensional Lean Six Sigma approach can be used to evaluate projects in terms of their difficulty and value. Problems in the upper-left quadrant (high payoff and low difficulty) are preferred (Figure 6).
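The pick-chart logic above can be encoded as a small lookup. This is a hypothetical helper of our own, mapping a project's estimated payoff and difficulty to the action in Figure 6:

```python
# Hypothetical helper: Lean Six Sigma project pick chart as a lookup table.

def pick_quadrant(payoff: str, difficulty: str) -> str:
    """payoff/difficulty are 'high' or 'low'; returns the recommended action."""
    chart = {
        ("high", "low"): "do first",    # preferred quadrant
        ("high", "high"): "do last",
        ("low", "low"): "do second",
        ("low", "high"): "avoid",
    }
    return chart[(payoff.lower(), difficulty.lower())]

action = pick_quadrant("High", "Low")   # -> "do first"
```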

2.2 Industrial internet of things

According to Intel (2021), there are five steps to follow in creating accelerated value from the data used (Figure 7).
Step 1: the organization assesses the potential impact by considering the possible benefits that can be obtained from the project. A return-on-investment estimate is included to gauge findings without affecting overall performance. Resources and processes are planned.
Step 2: different solutions and systems are considered for the data analysis, i.e. how the data will be prepared, stored and monitored. The selection is made based on the company's needs, ranging from basic statistical models to complex data analysis.
Step 3: edge analytics involves processing the data close to the data source. This enables prior data filtering, retaining only the essential information and uploading it to the cloud. Predictive systems can be implemented using edge analytics right in the field, preventing a delay in the problem response.

Figure 6. Lean Six Sigma project pick chart

                  Low difficulty    High difficulty
High payoff       Do first          Do last
Low payoff        Do second         Avoid

Figure 7. Five steps to create an accelerated value for the data used

Step 4: the capability of existing solutions is considered. Trusted organizations must be selected to reduce long-term risks. Performance, security and scalability must be incorporated.
Step 5: continuous improvement keeps the company measuring its performance, updating its capabilities and reaching its business goals. Sensorization of the processes must be achieved as a prerequisite to deriving value from the data.
2.2.1 Sensorization as a driver of Industry 4.0. Quality 4.0 is creating support for digital transformation, including connectivity, the use of AI and automation, through emerging technologies (Radziwill, 2020).
The IIoT has enabled smart sensors to become state-of-the-art devices in various applications. They can calibrate, configure and often repair themselves when required (Schütze et al., 2018). Once implemented, their stored data can be used to detect a process malfunction or irregularity.
Smart sensors (or intelligent transducers) were defined by the Institute of Electrical and
Electronics Engineers as an “integration of analog or digital transducers, its associated
signal, a processing unit, and a communication interface for data transfer” (Lee, 2000).
Sensors can be added to an already functioning system or can be manufactured along with
its processing unit, creating a closed circuit with higher monitoring capabilities (Frank,
2013).
Industry 4.0 relies on sensor recordings to achieve its quality standards, using real-time
data to prevent production problems. Some common data are temperature, pressure,
displacement, force, torque, flow, strain and humidity. This requires measurement
techniques, such as piezoresistive, capacitive, optoelectronic, inductive and ultrasonic
(Frank, 2013).

2.3 Industrial big data

IBD refers to the big, diversified, time-series data generated by manufacturing systems. IBD
leverages MLA and the IIoT (Boyes et al., 2018) to develop predictive systems in the
Industry 4.0 era. Audio, video, imaging and sensors are the most common devices used to
obtain the supporting data for intelligent maintenance (Cachada et al., 2018), quality
monitoring (Abell et al., 2017), smart energy (Lund et al., 2017) and supply chain (Wu et al.,
2016) systems. However, there are challenges in terms of V’s (Uddin and Gupta, 2014;
Lahiru, n.d.; Abell et al., 2017).
The V's are described from a manufacturing perspective (Figure 8):
• Volume is the size of the data being created.
• Velocity describes how fast the data need to be created, accessed and processed for prediction. A Quality 4.0 initiative requires nearly real-time speed.
• Variety refers to analyzing the different forms of data sources (videos, images, audio or signals).
• Veracity is the trustworthiness of the data.
• Variability refers to the dynamic nature of manufacturing systems.

The transient and novel sources of variation introduce nonsteady data distributions into manufacturing systems. In ML, the concept of drift captures the fact that the statistical distributions of the classes that the model is attempting to predict change over time in unexpected ways (Webb et al., 2017; Wang and Abraham, 2015).
• Visualization refers to how the data-mining results are summarized and used by management for decision-making.
Figure 8. The 10 V's of IBD

• Verification is how the models are validated (White, 2000). Typically, pilot runs would provide unbiased information.
• Vigilance refers to the strategy that must be developed to address the variability attribute: a learning, relearning and forgetting scheme.
• Void is the propensity of plant dynamics to have many empty manufacturing-derived data records.
• Value corresponds to the essence of understanding the value of the Quality 4.0 project before assigning it to the data science team.
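The drift and vigilance notions above can be sketched as a minimal check. This mean-shift test is an illustrative assumption of ours, not the paper's method; production systems use dedicated drift detectors:

```python
# Minimal drift check: flag drift when a recent window's mean leaves
# the baseline mean by more than k standard errors (toy sketch).
from statistics import mean, stdev

def drifted(baseline, window, k=3.0):
    """Return True when the window's mean departs from the baseline."""
    se = stdev(baseline) / len(baseline) ** 0.5
    return abs(mean(window) - mean(baseline)) > k * se

history = [0.9, 1.0, 1.1] * 10            # in-control sensor feature
in_control = drifted(history, [1.0, 1.01, 0.99])   # no shift
shifted = drifted(history, [1.2, 1.3, 1.25])       # mean has moved
```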

3. The classification principle


The principle of classification, or pattern recognition, is a scientific discipline that studies the automatic classification of items into several categories (classes) (Theodoridis and Koutroumbas, 2001). Classification is suitable for modeling nonstochastic or deterministic relationships between inputs (features) and outputs (classes). However, it should not be used when two items with identical inputs can easily have different outcomes; for the latter, a probabilistic study is better (Harrell, 2020).
Class characterization is the key to successfully classifying patterns, i.e. enabling the classifier to define the classification boundaries properly. This information must be captured by the features included in the training data. A feature is an observable variable of the phenomenon being investigated. Creating features with discriminative information that are independent of one another is a crucial step in pattern recognition modeling (Bishop, 2006). After the features have been created and selected, the MLA are used to learn the class patterns that usually exist in hyperdimensional spaces. A classifier is a discrete-valued function that uses the features to assign a class label to a particular item (Raschka and Mirjalili, 2017). Figure 9 demonstrates some patterns in a two-dimensional space.
Figure 10 provides a high-level overview of the data roadmap. First, data science leverages advanced domain knowledge to observe the process and generate relevant data through connected devices such as cameras, microphones or sensors (IIoT). Then, features are created from the raw process data, and the relevant ones are selected to generate the training set. The process of combining engineering domain knowledge with data science to generate features is called feature engineering (Veeramachaneni et al., 2014). Finally, an MLA is applied to generate a classifier.
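The feature-engineering step above can be sketched as follows. The function and the specific features (mean, dispersion, peaks) are illustrative assumptions, not the paper's feature set:

```python
# Hypothetical feature-engineering step: condense one raw in-process
# sensor signal into a small candidate feature vector.
from statistics import mean, stdev

def extract_features(signal):
    """Map a raw signal (list of samples) to candidate features."""
    return {
        "mean": mean(signal),
        "std": stdev(signal),                      # dispersion
        "peak": max(signal),                       # e.g. a pressure spike
        "peak_to_peak": max(signal) - min(signal),
    }

features = extract_features([0.1, 0.4, 0.35, 0.2, 0.9, 0.15])
```

In practice, one such vector would be computed per manufactured item and paired with the item's quality label to form a training row.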

3.1 Machine learning algorithms

MLA are programs that learn from data and improve as more data are added, without human intervention. The learning task consists of learning the function that maps the inputs to the targets. There is no a priori distinction between MLA (Wolpert, 1996), so a good learning strategy should include diverse and complementary MLA. Dozens more algorithms could be listed; some can be quite effective in specific scenarios.
To address the coverage optimization problem (Ho, 2000), Big Models (BM) is founded on eight MLA: Logistic Regression (LR) (Lee et al., 2006), Support Vector Machine (SVM) (Ray, 2017), including the Radial Basis Function (RBF) kernel (Valente Klaine et al., 2017), Naive Bayes (Wu et al., 2008), K Nearest Neighbors (KNN) (Imandoust and Bolandraftar, 2013), Artificial Neural Network (ANN) (Demuth et al., 2014), Random Forest (RF) (Breiman, 2001) and Random UnderSampling Boosting (RUSBoost) (Seiffert et al., 2009) algorithms (Table 1).
The proposed list includes margin- and probabilistic-based, linear and nonlinear (Demuth et al., 2014), parametric and nonparametric (Murphy, 2012), stable and unstable (Davidson, 2004) and generative and discriminative (Permission, 2005) algorithms[1]. This diverse list of MLA can solve a wide range of classification problems and enables the creation of Multiple Classifier Systems (MCS) to optimize prediction (Clemen, 1989).

Figure 9. Illustration of binary patterns

Figure 10. Data road map
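The Multiple Classifier System idea can be sketched with a majority vote over diverse models. The three decision rules below are hypothetical stand-ins for trained MLA, not the paper's BM implementation:

```python
# Sketch of a Multiple Classifier System (MCS): combine the predictions
# of diverse classifiers by majority vote.

def majority_vote(classifiers, x):
    """Each classifier maps x -> 0/1; return the majority label."""
    votes = [clf(x) for clf in classifiers]
    return int(sum(votes) > len(votes) / 2)

clfs = [
    lambda x: int(x > 2.0),    # stand-in for, e.g., a logistic regression
    lambda x: int(x > 2.5),    # stand-in for an SVM
    lambda x: int(x > 10.0),   # stand-in for a weak, overly strict rule
]
label = majority_vote(clfs, 3.0)   # two of three vote 1, so the ensemble says 1
```

Combining diverse, complementary models in this way tends to reduce the variance of any single classifier's errors, which is the motivation cited for MCS.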

3.2 Training data


To successfully develop AI solutions, the right training data are required. In ML, training
data are the data used to train a MLA. The training data consist of predictive features
(inputs) and labels (targets). Better features result in faster training and more accurate
predictions. Human intelligence is required to label the data. In a binary quality classification problem, labels are defined as follows:
\[
\mathrm{Label}_i =
\begin{cases}
1 & \text{if the } i\text{th item is defective } (+)\\
0 & \text{if the } i\text{th item is good } (-)
\end{cases}
\tag{1}
\]
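The labeling rule in equation (1) is direct to express in code (the function name is ours):

```python
# Equation (1) as code: the binary labeling rule for quality data.

def label(is_defective: bool) -> int:
    """Return 1 for a defective (+) item and 0 for a good (-) item."""
    return 1 if is_defective else 0

labels = [label(d) for d in [False, False, True]]   # -> [0, 0, 1]
```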

Process-derived data are observational data (Montgomery, 2017) used by the MLA to learn intrinsic quality patterns. However, ML models should not be used for theory building (Wasserman, 2013; Shmueli, 2010). Instead, the extracted information (revealed patterns, associations and correlations) should be used for hypothesis generation and to guide randomized experiments. Montgomery (2014) conveyed this concept:
From basic statistics, correlation does not imply causality. However, with enough data and strong correlations that hold up over time, useful business decisions can often be made. Data mining activities should produce at least a set of hypotheses that can potentially be tested using more rigorous methods, such as designed experiments.

3.3 Learning the pattern


Perhaps one of the primary drivers of ML applications is their ability to learn the pattern of a problem that conventional approaches, including human qualitative interpretation, cannot. In this regard, data-driven ML techniques are very powerful, as they can often recognize nonobvious patterns. The model developer must guard against some common, avoidable pitfalls and ensure that the model is valid, generalized and unbiased over the appropriate domain. In this section, some concepts regarding algorithm selection, data fitting and model validation are discussed.

Table 1. Characteristics of the MLA

MLA        Linear  Nonlinear  Parametric  Nonparametric  Stable  Unstable  Gen  Dis
SVM        ✓ ✓ ✓ ✓
LR         ✓ ✓ ✓ ✓
NB         * ✓ ✓ ✓
KNN        ✓ ✓ ✓ ✓
ANN        ✓ ** ✓ ✓
SVM(RBF)   ✓ ✓ ✓ ✓ ✓
RF         ✓ ✓ ✓ ✓
RUSBoost   ✓ ✓ ✓ ✓

Notes: *With numeric features. **With a set of parameters of fixed size. Gen: generative; Dis: discriminative.
ML algorithms use empirical, or data-driven, learning, of which there are three basic types:
(1) supervised learning;
(2) unsupervised learning; and
(3) reinforcement learning.

In supervised learning, the MLA is trained with labeled data. In unsupervised learning, the
MLA makes inferences on unlabeled data and is often used to find “hidden” patterns in a
data set. Reinforcement learning uses penalties or rewards to train the model. All MLA come
with inherent inductive or learning biases; therefore, some algorithms will be more suitable
for particular data sets and scenarios than others. There is no universally "best" MLA. As such, many algorithms should be explored to effectively and efficiently learn the pattern of the specific problem. The "no free lunch" theorem encapsulates this sentiment: the model developer must draw on their own knowledge and experience of the data and problem context to select the appropriate MLA.
Once a model is created, it is important to quantify and moderate its bias and variance
prediction errors (Singh, 2018). Of course, there is a trade-off between minimizing these
errors, and it is not possible to minimize both. Bias error occurs when the learning algorithm
is performed under erroneous assumptions. A model exhibiting high bias will underfit the
data and omit relevant correlations. In other words, the model is over-simplified. The
repercussion is large errors on both the training and test data. Variance error arises from
overfitting the training set. Models with high variance are not generalized, and as such,
model very small data fluctuations such as system noise. Consequently, these models
exhibit very small errors on the training data and large errors on the test data. There is
generally a trade-off between high bias and high variance, and the ideal case is when the
bias and variance errors are balanced to result in a model which neither under – nor overfits
the data. See Figures 11 and 12.

Figure 11.
Bias and variance
errors trade-off

Figure 12.
Underfitting (top left),
overfitting (top right)
and “ideal” balanced
fit (bottom)
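The under/overfitting behavior described above can be illustrated with a toy sketch. The data and models are hypothetical: a global-mean predictor underfits (high bias, large error everywhere), while a memorizing nearest-neighbour predictor overfits (zero training error, test error driven entirely by noise):

```python
# Toy illustration of the bias/variance trade-off (hypothetical data).

train = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8)]
test = [(0.5, 0.5), (1.5, 1.6), (2.5, 2.4)]

def mse(model, data):
    """Mean squared prediction error of model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

mean_y = sum(y for _, y in train) / len(train)
underfit = lambda x: mean_y                                      # ignores x: high bias
overfit = lambda x: min(train, key=lambda p: abs(p[0] - x))[1]   # memorizes: high variance

bias_train_err = mse(underfit, train)   # large: misses the trend entirely
var_train_err = mse(overfit, train)     # zero: memorized the training set
```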

Quantifying the learning curves (i.e. the training and testing errors with respect to the
training set size) can help data scientists properly diagnose whether the model exhibits
high bias/variance errors and decide accordingly which steps should be taken to improve
it. As previously mentioned, a model that cannot sufficiently learn the training data set
underfits the data; this manifests in the learning curves as high test and training errors.
Overfitting results in learning curves that exhibit a high testing error and a low training
error, indicating that the model is not generalized and only performs well on the training
data. The ideal case results in learning curves where both the training and testing losses
decrease to a point of stability, where they are sufficiently low and only separated by a
small gap. See Figure 13.
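Such learning curves can be sketched as follows (a minimal illustration on synthetic data; the degree-1 and degree-2 polynomial models are assumptions for demonstration, not the paper's models):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 400)
y = x**2 + rng.normal(0, 0.5, 400)
x_pool, y_pool = x[:300], y[:300]     # training pool
x_test, y_test = x[300:], y[300:]     # fixed test set

def learning_curve(degree, sizes):
    """Train on growing subsets of the pool and record the train/test
    MSE at each size -- the two learning curves."""
    out = []
    for n in sizes:
        coef = np.polyfit(x_pool[:n], y_pool[:n], degree)
        tr = np.mean((np.polyval(coef, x_pool[:n]) - y_pool[:n]) ** 2)
        te = np.mean((np.polyval(coef, x_test) - y_test) ** 2)
        out.append((n, tr, te))
    return out

# A degree-1 model underfits: both curves plateau at a high error.
# A degree-2 model is well specified: both curves converge near the
# irreducible noise level with only a small gap.
for deg in (1, 2):
    n, tr, te = learning_curve(deg, [20, 80, 300])[-1]
    print(f"degree {deg}: n={n}, train={tr:.2f}, test={te:.2f}")
```

The high-bias model's curves stay high even with all the data, which is the diagnostic signature described above.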
Once the learning curves have been quantified, the following actions are recommended if
high bias/variance problems persist (Hastie et al., 2001; Luz, 2017). If the model exhibits high
bias, adding/creating more features or decreasing the regularization parameter λ (L1
penalty) will increase model complexity and thereby help mitigate the underfitting
problem. On the other hand, if the model exhibits high variance, more data should be
generated to induce more parsimony; alternatively, fewer features can be included by
removing trivial ones, or the regularization parameter can be increased.
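The regularization lever can be illustrated with a short sketch. Note the assumption: it uses ridge (L2) regression rather than the L1 penalty mentioned above, because L2 admits a one-line closed form; the shrinking effect of raising or lowering the regularization parameter is analogous.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 30)
y = x**2 + rng.normal(0, 0.1, 30)

# Degree-10 polynomial features: far more flexibility than 30 points
# support, so the unregularized fit is prone to high variance.
X = np.vander(x, 11)

def ridge(X, y, lam):
    """Closed-form ridge regression: w = (X'X + lam*I)^(-1) X'y.
    Raising lam shrinks the weights (less variance, more bias);
    lowering it frees the model (less bias, more variance)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

w_loose = ridge(X, y, lam=1e-8)   # nearly unregularized
w_tight = ridge(X, y, lam=10.0)   # heavily regularized
print(np.linalg.norm(w_loose) > np.linalg.norm(w_tight))  # True
```

Shrunken weights correspond to a smoother, lower-variance model; a λ set too high pushes the model back toward underfitting.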
Validation is necessary to ensure a trustworthy model by estimating its generalized
unbiased performance (Grootendorst, 2019). At the bare minimum, any robust MLA must be
validated by splitting the data, which enables model performance to be evaluated on
previously unseen data. The split percentage depends on the particular project objectives,
including computational cost and training/test set representativeness. The test error in this
method may be highly variable, since it depends on which observations are included in the
training and validation sets. A common split assigns 70% of the data to training and 30%
of the data to testing (Figure 14).
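A minimal index-level sketch of such a 70/30 random split (the sample size is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 1_000
indices = rng.permutation(n_samples)   # shuffle to randomize the split

n_train = int(0.70 * n_samples)        # 70% train / 30% test
train_idx, test_idx = indices[:n_train], indices[n_train:]
print(len(train_idx), len(test_idx))   # 700 300
```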
IJLSS

Figure 13. Learning curves for high bias (top left), high variance (top right) and “ideal” case (bottom)

Figure 14. Train/test validation split (70/30%)

Optimizing model hyperparameters (parameters whose values are set before the learning
process) on data split into training and testing invites the risk of overfitting the model on
that specific split data set. To avoid this, it is advisable to further split the data into three
sets, where the third set, termed the holdout set, is a set of data not used during
hyperparameter tuning. This enables the model to be validated on the holdout set after
model optimization (which uses the train/test data) to avoid the issue of overfitting. This
new split ratio could look something like the following: 60% training data, 20% testing data
and 20% holdout data (Figure 15).
Though data splitting should be performed using random subsets of the data, it is
possible for sampling biases to occur. Sampling bias will introduce error by over-
representing one or more subsets of the population while under-representing others. To
avoid this, k-Fold Cross-Validation (CV) can be used. The reader is referred to Brownlee
(2018), who provides a straightforward introduction to the k-Fold CV technique. In this
technique, the data are split into k groups or folds. The model is trained using k-1 of the folds
as training data. The remaining portion of the data is apportioned to test data. This is done
for all unique groups. The advantage of this method over repeated random subsampling is
that all samples are used for both training and testing, and each observation is used for
testing exactly once. Five- to 10-fold CVs are commonly used (McLachlan et al., 2005). A
higher value of k leads to less bias (though the larger variance might lead to overfit), whereas a
lower value of k is similar to the train-test split approach (Sanjay, 2018). Notably, applying
the k-Fold CV technique to time series data runs the risk of overfitting the data (since
training data may contain future information). To guard against this, the model developer
should ensure that the training data takes place prior to the test data (Figure 16).

Figure 15. Train/test/holdout split (60/20/20%)
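A minimal index-level sketch of k-Fold CV (not tied to any particular MLA):

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Yield (train, test) index arrays for k-Fold CV: the shuffled
    indices are cut into k folds, and each fold is the test set
    exactly once while the remaining k-1 folds form the training set."""
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    for i in range(k):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]

# 10 samples, k = 5: every sample is tested exactly once
for train, test in kfold_indices(n=10, k=5):
    print(sorted(test.tolist()), "| train size:", len(train))
```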
The “Leave-One-Out Cross-Validation” (LOOCV) technique is a special case of the k-Fold
CV technique where k is set equal to the number of observations. In LOOCV, each sample in
the data set is treated as its own test set, and the remaining samples form the training set.
This method is approximately unbiased, but it tends to exhibit high variance. However, it is
preferred over CV when data are scarce; in this condition, CV is likely to exhibit high
variance as well as a higher bias because its training sets are smaller than those of LOOCV.
Therefore, this method is only practical on “small” data sets (Figure 17).
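A small LOOCV sketch on an illustrative 15-sample data set (a quadratic fit stands in for the model under study):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(-3, 3, 15)            # a "small" data set
y = x**2 + rng.normal(0, 0.5, 15)

# LOOCV: each sample is held out once; the other n-1 train the model.
squared_errors = []
for i in range(len(x)):
    keep = np.arange(len(x)) != i
    coef = np.polyfit(x[keep], y[keep], 2)
    squared_errors.append(float((np.polyval(coef, x[i]) - y[i]) ** 2))

loocv_mse = sum(squared_errors) / len(squared_errors)
print(round(loocv_mse, 2))
```

With n = 15, this costs 15 model fits; the same loop on a large data set quickly becomes impractical, which is the point made above.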

3.4 Prediction ability evaluation


Classifiers commit misclassifications because prediction is performed under uncertainty.
These errors are investigated to better understand the predictability. This information is
used to determine whether or not the learning objectives of the project have been met. A
positive result in a binary classification of quality problems refers to a defective item, while
a negative result refers to a good quality item. The confusion matrix (CM) (Fawcett, 2006) is
a table used to summarize the predictive performance of a classifier (Table 2).

Figure 16. Cross-validation (k = 5)

Figure 17. Leave-one-out cross-validation (n = 20)

Table 2. CM for binary classification of quality

Confusion matrix | Predicted good      | Predicted defective
Good item        | True Negative (TN)  | False Positive (FP)
Defective item   | False Negative (FN) | True Positive (TP)
Figure 18 illustrates this concept. A green triangle denotes the good class, whereas a red
circle denotes the defective class. When the classification threshold (CT) is set to 0, the
negative values represent the good region, while the positive values represent the defective
region. The TP is a defective item that has been correctly classified, and the FP is a good
quality item classified as a defective. The TN is a good quality item correctly classified,
whereas the FN is a defective item not detected.
The type-I (α) error refers to the FP rate, and the type-II (β) error defines the FN rate (i.e.
missing rate); β is the probability that a defective item will be missed by the classifier. The
α and β errors (Devore, 2015) are computed as follows:

α = FP / (FP + TN);  β = FN / (FN + TP)
Statistically, these two error rates pose a trade-off: they cannot be simultaneously
minimized. As demonstrated in Figure 19, reducing one type of error – moving the CT right
or left – increases the other one.
In the manufacturing context, the β error is used to determine the defect detection ability of
the classifier. In contrast, the α error is used to determine whether the inefficiency created by
the FP rate can be handled by the plant. To reduce warranties, the β error has higher importance.
The maximum probability of correct decision (MPCD) is used to address the trade-off
posed by the α and β errors. It is a probabilistic-based measure of classification
performance aimed at analyzing highly/ultra-unbalanced data structures (Abell et al., 2014,
2017). It is sensitive to the recognition rate by class; therefore, in a high-conformance
production rate, its score mainly describes the ability of a classifier to detect the defective
class. The α and β values are combined as follows:

MPCD = (1 − α)(1 − β)

MPCD ∈ [0, 1], where a higher score indicates better classification performance.

Figure 18. Graphical illustration of the CM

Figure 19. Alpha-beta trade-off
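The α, β and MPCD computations reduce to a few lines given confusion-matrix counts. As an illustration, the counts below are the SVM test-set values later reported in Table 4, so the rounded outputs should match that row:

```python
def alpha_beta_mpcd(tn, fp, fn, tp):
    """Type-I (alpha) and type-II (beta) error rates and the MPCD
    score for a binary quality classifier."""
    alpha = fp / (fp + tn)        # good items wrongly flagged defective
    beta = fn / (fn + tp)         # defective items missed
    return alpha, beta, (1 - alpha) * (1 - beta)

# Counts from the SVM test-set result (Table 4)
alpha, beta, mpcd = alpha_beta_mpcd(tn=12_190, fp=37, fn=1, tp=8)
print(round(alpha, 4), round(beta, 4), round(mpcd, 4))  # 0.003 0.1111 0.8862
```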

3.5 Multiple classifier system


An MCS is a powerful solution to difficult pattern recognition problems, outperforming the
best individual classifier (Clemen, 1989). The primary causes of error (i.e. misclassifications)
are noise, bias and variance. Ensemble learning (e.g. bagging and boosting) produces a more
reliable classification than a single classifier and thus reduces these sources of error. To
design an MCS, an appropriate fusion rule is required to combine the individual classifier
outputs optimally into the final decision (classification). Majority, simple majority
and unanimous voting are the most common fusions (Wozniak et al., 2014).
In an MCS, heterogeneous or homogeneous models (e.g. LR, SVM, RF) are integrated to
exploit the strengths of each individual classifier and to overcome the limitations of an
optimal local solution. Furthermore, diversity helps lower the correlation between classifier
outputs (Zenobi and Cunningham, 2001; Krogh and Vedelsby, 1995) and provides better
options to explore different fusion rules (Escobar et al., 2021a, 2021b, 2021c, 2021d, 2021e).
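A minimal sketch of a majority-voting fusion rule; the three classifiers' outputs below are hypothetical, not results from the paper:

```python
import numpy as np

def majority_vote(predictions):
    """Fuse binary class labels from several classifiers: output 1
    (defective) when more than half of the classifiers vote 1."""
    predictions = np.asarray(predictions)
    return (predictions.sum(axis=0) > predictions.shape[0] / 2).astype(int)

# Hypothetical outputs of three diverse classifiers (e.g. LR, SVM, RF)
# on five items; each row is one classifier's predictions.
preds = [
    [0, 1, 0, 1, 0],   # classifier 1
    [0, 1, 1, 1, 0],   # classifier 2
    [1, 1, 0, 0, 0],   # classifier 3
]
print(majority_vote(preds))   # [0 1 0 1 0]
```

Here the ensemble recovers the correct labels even though classifiers 2 and 3 each make one mistake, which is the diversity benefit described above.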

4. Literature review – machine learning in quality control


The concept of zero defects in manufacturing reemerged in 2013 when Ke-Sheng Wang
(Wang, 2013) presented a general framework on how to apply data mining in manufacturing.
The author described the basic components of a quality monitoring system aimed at fault
diagnosis or failure prognosis. A pilot study, a three-dimensional intelligent quality inspection
system, was also conducted to support their framework. Three MLA were used to perform
the quality inspection (SVM, ANN, Classification Tree) and k-fold cross-validation was
applied to evaluate the performance of each classifier. The best classifier (SVM) was
selected based on correctness, i.e. percentage of samples correctly classified and the FP rate.
The proposed framework does not support the analysis of highly/ultra-unbalanced data
structures, and it assumes no time effect since cross-validation is used. Finally, no decision
combination scheme aimed at improving prediction was reported.
A hybrid PM and diagnosis method based on cluster analysis and the SVM algorithm
was developed by Wuest et al. in 2014 (Wuest et al., 2014). In this work, the authors proposed
a “red flag” warning system aimed at modeling the dependencies between the different
states or state characteristics. The approach allows experts to decide whether the
product can still reach the final quality requirements or if it should be scrapped. Clustering
analysis was applied to identify potential undesirable states by isolating extreme states and
the SVM to determine, in quasi real-time, the process/product states at various points in the
overall manufacturing system. The authors concluded that PM based on clustering analysis
and supervised ML on product state-based data can potentially increase quality in
manufacturing.
In 2017, Granstedt Möller (2017) presented a ML application for QC in Scania Engines
based on the Convolutional Neural Network algorithm, where the objective was detection in
highly unbalanced classes to separate bad from good. A general learning approach was
followed based on k-fold cross-validation; therefore, the time-effect was assumed negligible.
Preliminary prediction performances were reported in terms of type-I and type-II errors;
however, relatively good detection was achieved at the expense of a high type-I error. The
author recognized that quality from the production process does not get better directly by
detection; however, it increases the ability to control the production and, therefore, prevents
defects from reaching the customers.
In 2019, Said et al. (2019) proposed a new method for ML flaw detection using the reduced
kernel partial least squares method to handle nonlinear dynamic systems. Their research
resulted in reduced calculation time and a decrease in the rate of FNs. Malaca et al. (2019)
designed a system to enable the real-time classification of automotive fabric textures in low
lighting conditions.
In 2020, Xu and Zhu (2020) used the concept of fog computing to design a classification
system to identify potential flawed products within the production process. Cai et al. (2020)
proposed a hybrid information system based on an ANN of short-term memory to predict or
estimate wear and tear on manufacturing tools. Their results showed outstanding
performance for different operating conditions. Hui et al. (2020) designed data-based
modeling to assess the quality of a linear axis assembly based on normalized information
and random sampling. They used the replacement variable selection method, synthetic
minority oversampling technique and optimized multiclass SVM.
The presented applications include classifier development techniques, fault detection, PM
or QC systems. The common denominator is that they are all data-driven approaches based
on MLA. However, in most cases, a particular problem is solved. Therefore, unless quality
practitioners have a similar problem, they cannot develop a high-impact Quality 4.0 initiative
from what is reported. In this context, we introduce PMQ to the quality community.

5. Process monitoring for quality – the quality 4.0 initiative


In Industry 4.0, PMQ (Abell et al., 2017) is a Quality 4.0 initiative that systematically guides
the application of AI approaches to IBD to generate value (Figure 20). It is a blend of PM and
QC, aimed at real-time defect detection and empirical knowledge discovery, where detection
is formulated as a binary classification problem.

Figure 20. PMQ in the context of Industry 4.0
PMQ is based on BM (Escobar et al., 2018a, 2018b), a predictive modeling paradigm that
applies ML, statistics and optimization to process data to develop the classifier (Figure 21).
Data mining (i.e. empirical) results help identify the system’s driving features and uncover
hidden patterns (Chandrashekar and Sahin, 2014; Hua et al., 2004). Domain knowledge
experts further investigate this information to generate a new set of hypotheses tested by
experimental means, i.e. the design of experiments. Discovered information is used to
augment human troubleshooting and guide process redesign and improvement. Figure 22
depicts the conceptual framework of PMQ.
Customers expect perfect quality; although most manufacturing processes generate only
a few DPMO, a single warranty event can significantly impact the company’s reputation.
Therefore, rare quality event detection is one of the relevant challenges addressed by PMQ.
The BM learning paradigm is founded on ad hoc learning approaches to analyze these data
structures effectively.

5.1 Applications
Though quality inspections are widely practiced before/during/after production, they
still heavily rely on human capabilities (and limitations). According to a recent survey,
inspections were primarily manual (i.e. less than 10% automated) (Belfiore, 2016). PMQ
proposes to use real-time process data to monitor and control the processes automatically.
This application has the desirable characteristics of a ML project, where the primary goal is

to learn a repetitive, simple mental concept performed by inspectors, where the task is
formulated as a binary classification problem.

Figure 21. BM concept

Figure 22. PMQ conceptual framework
To establish how PMQ advances the State-of-the-Art of quality, three conventional QC
scenarios without AI are investigated in Figure 23. Then, their counterparts are presented in
Figure 24.
A typical manufacturing process, shown in Figure 23, produces only a few DPMO. The
majority of these defects are detected (TP) using either a manual/visual inspection
[Figure 23(a)] or an SPC/SQC system [Figure 23(b)]. Detected defects are removed from the
value-adding process for a second evaluation, where they are finally either reworked or
scrapped. Given that neither inspection approach is 100% reliable (See, 2015; Wuest et al.,
2013), both can commit FP (i.e. call a good item defective) and FN (i.e. call a defective item
good) errors; whereas FP causes the hidden factory effect by reducing the efficiency of the
process, FN should always be avoided.
In extreme cases, Figure 23(c), time-to-market pressures may compel a new process to be
developed and launched even before it is understood from a physics perspective. Even if a
new SPC/SQC model/system is developed or a preexisting model or system is used, it may
not be feasible to measure its quality characteristics (variables) within the time constraints
of the cycle time. In these intractable or infeasible cases, the product is launched at high risk
for the manufacturing company.
The BM learning paradigm is used to design a classifier with high defect detection
ability to be deployed at the plant, e.g. final model (Figure 24). This data-driven method is
used to eliminate manual/visual inspections and to develop an empirical-based QC system
for the intractable and unfeasible cases [Figure 24(a)].
In a statistically controlled process, PMQ is used to detect those few DPMO (FN) not
detected by the SPC/SQC system to enable the creation of virtually defect-free processes
through perfect detection (Escobar et al., 2020) [Figure 24(b)]. A full analysis of the PMQ
applications is provided in Escobar et al. (2018a, 2018b, 2021a, 2021b, 2021c, 2021d, 2021e).

Figure 23. Traditional QC scenarios. Notes: (a) manual/visual control; (b) statistical control; (c) intractable/unfeasible control

Figure 24. PMQ applications. Notes: (a) empirical-based (data-driven) control; (b) boost a process statistically under control

6. The problem-solving strategy


Figure 25. PMQ problem-solving strategy

PMQ uses a seven-step problem-solving strategy, IADLPR2 (Escobar et al., 2021a, 2021b,
2021c, 2021d, 2021e), to drive innovation (Figure 25). IADLPR2 is based on theory and our
knowledge of complex manufacturing systems. According to empirical results, this strategy
increases the chances of successfully deploying the PMQ Quality 4.0 initiative. Each of the
steps is briefly explained, and relevant references are provided:
• Identify: the primary goal of this step is to evaluate each of the potential projects to
select high-value complex engineering problems. First, each project is assessed
based on 18 questions (Figure 26). Then, the weighted project decision matrix is
applied to identify the best projects (Figure 27) (Escobar et al., 2021a, 2021b, 2021c,
2021d, 2021e). Next, potential projects are assessed based on data availability,
business value and chances of success. Finally, once the project has been selected,
the learning goals must be defined in terms of the alpha and beta errors to assess
feasibility.
• Acsensorize: the primary goal of this step is to observe the process (i.e. deploy
cameras or sensors) to generate the raw empirical data needed to monitor the
system (Abell et al., 2017; EY Oxford Analytica, 2019).
Figure 26. Project evaluation questions
• Discover: the primary goal of this step is to create the training data, i.e. create
features from the raw empirical data (Huan and Motoda, 1998; Boubchir et al., 2017),
and to label each of the samples.
• Learn: the primary goal of this step is to design the classifier using the BM learning
paradigm (Escobar et al., 2018a, 2018b). This step includes preprocessing
techniques (e.g. outlier detection, normalization/standardization, feature selection,
imputations, transformations) and training the eight MLA described in Table 1.
• Predict: the primary goal of this step is to develop a MCS to optimize prediction, i.e.
improve the prediction ability of the top performer by combining two or more classifiers.
An ad hoc algorithm is presented in Escobar et al. (2021a, 2021b, 2021c, 2021d, 2021e).
• Redesign: the primary goal of this step is to derive engineering knowledge from the
data mining results. The extracted information is used to generate useful
hypotheses about possible connections between the features and the quality of the
product. Statistical analyses can be designed to establish causality, support
root-cause analyses and identify optimal parameters to redesign the process
(Escobar et al., 2021a, 2021b, 2021c, 2021d, 2021e).
• Relearn: the main goal of this step is to develop a relearning strategy for the
classifier to learn the new statistical distributions of the classes; in ML, this is
known as concept drift (Webb et al., 2017; Wang and Abraham, 2015). This step
specifies how the retraining data are generated and how frequently the classifier is
retrained. Typically, the plant dynamics are considered when developing a
relearning strategy (Escobar et al., 2021a, 2021b, 2021c, 2021d, 2021e).
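The weighted project decision matrix of the Identify step reduces to a weighted sum over the question ratings. The sketch below is illustrative only; the weights and ratings are stand-ins, not the paper's 18 actual questions:

```python
def project_score(weights, ratings):
    """Weighted project decision matrix: a project's score is the sum
    of (question weight x project rating) over all questions."""
    return sum(w * p for w, p in zip(weights, ratings))

# Illustrative values: three questions with weights 9, 3 and 9,
# answered by one project with ratings 9, 1 and 3.
print(project_score([9, 3, 9], [9, 1, 3]))   # 81 + 3 + 27 = 111
```

The project with the highest total score is ranked first, as done with the UWBT project in the case study below.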
DMAIC

7. Real case study


The Ultrasonic Welding of Battery Tabs (UWBT) is investigated; the UWBT technology
was highly reliable but poorly understood. This situation posed an intellectual and business
challenge since all the welds in the vehicle must be good for the electric motor to work.
However, no QC method had been developed at the time to ensure this level of process
conformance:
(1) Identify: because of the future of vehicle electrification, the UWBT is a very important
project. Weld quality is defined by strength. The problem is formulated as a binary
classification problem, i.e. discriminating good vs defective welds (w.r.t. welding
strength). Five projects are evaluated based on the 18 questions. According to the
weighted project decision matrix, project 2 (UWBT) had the greatest score (306) and
was therefore selected (Figure 28), while project 5 was ranked in second place.

Figure 27. Weighted project decision matrix

Figure 28. Weighted project decision matrix

In this project, under-body deviations from nominal are used to predict significant
deviations in final assembly early. This is another important project, driving the
innovation of vision systems applied to body-in-white dimensional quality systems.
Whereas both projects obtained similar scores across the questionnaire, a couple of
questions helped rank the UWBT project higher. For example, in question number four
(w4 = 9), it was observed that the dimensional system was exposed to many unrecorded
sources of variation between the under-body data and the final assembly data. Therefore,
potential relationships between the predictor variables (features) and the response
variables may be corrupted in intermediate steps. In this question, the UWBT project
obtained a p4 = 9 (81 points in total), whereas the dimensional project obtained a p4 = 1
(9 points in total). A full description of the project and preliminary results is provided in
Escobar et al. (2021a, 2021b, 2021c, 2021d, 2021e), which also discusses the effect of the
unrecorded sources of variation:
(1) Acsensorize: to ensure that all the physical aspects of the process are captured,
three sensors are deployed based on engineering knowledge:
• Linear variable differential transformer (LVT): a measure of the displacement of
the horn during the welding process.
• Delivered power (PWL): the power delivered by the welder during the welding
process.
• Acoustic signature (ASO): the acoustic signal produced during the welding process.

The generated signals provide the required information to monitor the process. After
comparing signals from good and defective welds, class separation is expected, as the
signals contain discriminative information (Figure 29). Top images (a), (b) and (c) describe
the PWL (strong step gradient), LVT and ASO signals derived from good quality welds.
Bottom images (d), (e) and (f) describe the PWL (weak step gradient), LVT and ASO signals
from defective welds. It is also observed that the ASO signals exhibit discriminative
information, as defective welds generate a bell-shaped pattern [Figure 29(f)]:
(1) Discover: the samples are manually labeled, and 54 features were created from the
signal data. Figure 30 shows the feature creation process, whereas Table 3 shows
the training data set. It contains 40,231 samples, including 36 defective welds.
(2) Learn: the data set is partitioned following a holdout validation scheme: training
set (18,495 samples, including 20 defective), test set (12,236 samples, nine defective)
and holdout set (9,500 samples, seven defective). Here, the goal is to detect the seven
defective items – zero FNs – in the holdout set (unseen data) with less than 1% α (FP).

Figure 29. Acsensorization of the ultrasonic welder and an example of observed signals
Figure 30. Signals to features

Table 3. Training data set

Sample | Feature 1 | Feature 2 | Feature 3 | ... | Feature 54 | Label
1      | 0.57      | 0.88      | 1         | ... | 1.03       | 0
2      | 0.17      | 0.17      | 0.25      | ... | 0.09       | 1
3      | 1.11      | 1.26      | 1.1       | ... | 1.15       | 0
...    | 0.24      | 0.18      | 0.3       | ... | 0.22       | 0
...    | 1.56      | 1.15      | 0.33      | ... | 1.24       | 1
...    | 1.56      | 1.47      | 1.52      | ... | 1.37       | 0
40,231 | 0.41      | 0.5       | 0.52      | ... | 0.23       | 1

The eight MLA are trained; Table 4 summarizes the results. The top performer is the SVM,
with α = 0.0030, β = 0.1111 (one FN) and MPCD = 0.8862:
(1) Predict: the optimization algorithm is used to search for the optimal fusion rule.
The algorithm developed a MCS based on the SVM and ANN with a fusion rule of 1:
if the sum of both predictions is greater than 1, the final prediction is positive
(defective); otherwise, it is negative (good) (Table 5). By combining both classifiers,
the MPCD increased from 0.8862 to 0.8870.
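The reported fusion rule reduces to a one-line function. This is a sketch of our reading of "fusion rule of 1": with two classifiers, the summed votes exceed 1 only when both flag the weld as defective:

```python
def fuse(svm_pred, ann_pred, threshold=1):
    """Fusion rule sketched from the case study: predict defective (1)
    only when the summed binary votes exceed the threshold; with
    threshold=1, both classifiers must agree on "defective"."""
    return int(svm_pred + ann_pred > threshold)

print(fuse(1, 1), fuse(1, 0), fuse(0, 0))   # 1 0 0
```

Requiring agreement suppresses false positives raised by only one classifier, which is consistent with the drop in FP (37 to 26) while the single FN remains.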

Finally, this MCS is applied to the holdout set to evaluate its prediction ability on unseen
data. The MCS shows very good predictive ability (MPCD = 0.9997). It detected the seven
defective items in the holdout set with a very low α error (0.0003) (Table 6):
(1) Redesign: the essential features of the quality pattern (welding strength) are identified
using a filter feature selection method (Escobar Diaz et al., 2022). Top features are
derived from the power signal (Figure 31). Then, after an engineering analysis, it was
discovered that the defective items showed a low power slope. Therefore, the
engineering team recommended a power booster for welds with a low slope. The
process becomes more robust and almost completely stops generating defective welds.
(2) Relearn: a relearning program is defined based on the process dynamics. The
reevaluated items (TP, FP) are used to retrain the algorithms every night. Also, an
alerting system is defined to keep track of the α error. The algorithms are retrained
immediately if the α error increases (>1%).

Table 4. Prediction results on the test set

MLA       | FN | FP  | TN     | TP | Alpha  | Beta   | MPCD   | Top
SVM       | 1  | 37  | 12,190 | 8  | 0.0030 | 0.1111 | 0.8862 | ✓
ANN       | 1  | 82  | 12,145 | 8  | 0.0067 | 0.1111 | 0.8829 |
LR        | 1  | 116 | 12,111 | 8  | 0.0095 | 0.1111 | 0.8805 |
NB        | 2  | 38  | 12,189 | 7  | 0.0031 | 0.2222 | 0.7754 |
RF        | 2  | 64  | 12,163 | 7  | 0.0052 | 0.2222 | 0.7737 |
KNN       | 7  | 0   | 12,227 | 2  | 0.0000 | 0.7778 | 0.2222 |
SVM (RBF) | 7  | 7   | 12,220 | 2  | 0.0006 | 0.7778 | 0.2221 |
RUSBoost  | 2  | 69  | 12,158 | 7  | 0.0056 | 0.2222 | 0.7734 |

Table 5. Multiple classifier system based on the SVM and ANN, with a fusion rule of 1

Classifiers | Fusion rule | FN | FP | TN     | TP | Alpha  | Beta   | MPCD
SVM + ANN   | 1           | 1  | 26 | 12,201 | 8  | 0.0021 | 0.1111 | 0.8870

Table 6. Holdout CM of the MCS

Confusion matrix | Predicted good | Predicted defective
Good item        | 9,490          | 3
Defective item   | 0              | 7

Figure 31. Feature ranking

This application illustrates how an empirical (data-driven) solution was developed to
monitor and control the quality of the UWBT process, a technology that was incompletely
understood when the Chevrolet Volt was launched. The process was controlled and
improved using AI approaches.
There are several lessons learned from this case study. First, solving the challenge posed
by the UWBT was a priority for General Motors. However, not all high-priority problems
can be solved by the proposed methodology; the 18 questions helped to evaluate feasibility.
Once the project was selected, engineering knowledge helped to deploy the right sensors.
After the data (signals) were generated, the analytical team took over the project and
applied the ML methods to develop the predictive system. After deployment, it was
observed that prediction degraded over time; therefore, a relearning strategy had to be
developed to address this situation. Finally, after analyzing the statistical distributions of
the classes, the root cause of low-quality welds was determined. This data-driven study
helped to address the problem from a physics-based perspective. The process was
redesigned to virtually eliminate the defective welds. The seven-step problem-solving
strategy captures the lessons learned.

7.1 Practical implications


A vision for AI is developed, data science teams are formed, enabling technologies are
purchased and high-business-value projects are chosen, but the company still receives little
to no value. Unfortunately, this is a common occurrence across industries – manufacturing
is no exception – as many of the selected projects are ill-conditioned from launch. Although
the 18 questions proposed by the authors help to mitigate this risk, often, the chosen projects
are still intractable even for AI. If this is the case, it is essential for managers to rapidly
identify when to drop a project, learn the lessons and assign a new project to the data science
team. In this context, the seven-step problem-solving strategy is presented in a logic
diagram that helps to identify the Go/No Go critical decision-making steps (Figure 32).
After a project is selected, the data science team is assigned, the learning targets are
defined (identify) and engineering knowledge is used to deploy relevant sensors (acsensorize).
The objective of the sensors is to generate signals with discriminative information that is
captured in the features (discover). Then, eight diverse MLA are applied to learn the quality
pattern; therefore, if a pattern exists, they will learn it. In the next step (predict), a meta-
learning optimization algorithm is applied to optimize the prediction. Here, if the learning
targets are not met, either the project can be abandoned or if the learning techniques were
correctly applied, new sensors or features are most likely required. In this context, predict is
the first tailgate, where many projects are abandoned. Therefore, it is important to reach this
step as soon as possible to evaluate feasibility. Suppose the learning targets are met in the
predict step. In that case, statistical analyses (e.g. randomized experiments) are devised to
establish causality to augment root-cause analyses and to find optimal parameters to
redesign the process. Finally, the next step (relearn) is a very critical one, where many
projects fail as the predictive systems do not sustain. In this step, usually, the model is
transferred from the lab to a real environment, i.e. pilot runs. In a real situation, the
manufacturing processes are often exposed to new and transient sources of variation that are
[Figure 32. Seven-step problem-solving strategy from a practical perspective]
not observed in the lab and, therefore, not modeled. Consequently, many solutions do not
sustain in real environments. Nevertheless, if these sources of variation are determined, they
can be observed and modeled; otherwise, the project is abandoned. However, abandoning a
project in the last step is costly; therefore, it is important to select a project for
which it is believed all the relevant data are available [2].
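The learn and predict steps above can be sketched in code. The following is a minimal illustration, not the paper's actual implementation: a handful of diverse classifiers (the paper uses eight MLAs) are trained on features extracted from sensor signals, and the project passes the predict tollgate only if the best learner meets a learning target. The synthetic data, the chosen classifiers and the 0.90 threshold are all illustrative assumptions.

```python
# Hypothetical sketch of the learn/predict tollgate; all names and the
# learning target are illustrative assumptions, not the paper's setup.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in for features extracted from sensor signals (discover step);
# the 90/10 class weights mimic the rarity of defective parts.
X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.9, 0.1], random_state=0)

learners = {
    "logistic": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(random_state=0),
    "forest": RandomForestClassifier(random_state=0),
    "knn": KNeighborsClassifier(),
    "svm": SVC(),
}

# Learn: estimate each model's performance with cross-validation.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in learners.items()}

# Predict tollgate: Go only if the best learner meets the learning target.
LEARNING_TARGET = 0.90  # illustrative threshold
best_name, best_score = max(scores.items(), key=lambda kv: kv[1])
decision = "Go" if best_score >= LEARNING_TARGET else "No Go"
print(best_name, round(best_score, 3), decision)
```

In practice, an imbalanced problem like this would be evaluated with a metric more informative than accuracy (e.g. the area under the ROC curve), but the Go/No Go gating logic is the same.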
The seven-step problem-solving strategy is not a recipe for success; rather, it is a formal
way to drive innovation and to ensure that the right steps are taken and that the major
challenges posed by manufacturing systems are addressed. Moreover, the Quality 4.0
initiative based on PMQ only applies to discrete manufacturing. Finally, another aspect
worth mentioning is that vision projects are typically solved by deep learning
[convolutional neural networks – CNNs (Goodfellow et al., 2016)]; therefore, the seven steps
do not apply to image classification problems [3].
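The relearn step discussed above hinges on detecting when production conditions no longer match the lab conditions under which the model was validated. A minimal drift check might monitor the model's rolling accuracy and flag the need to relearn when it degrades. The lab accuracy, tolerance and window size below are illustrative assumptions, not values from the case study.

```python
# Hypothetical relearn trigger: flag drift when rolling production
# accuracy falls well below the lab-validated level. All thresholds
# are illustrative assumptions.
from collections import deque

LAB_ACCURACY = 0.95      # accuracy achieved before deployment (assumed)
TOLERANCE = 0.05         # allowed degradation before flagging drift
WINDOW = 100             # number of recent parts to average over

recent = deque(maxlen=WINDOW)

def update(correct: bool) -> bool:
    """Record one prediction outcome; return True if relearning is needed."""
    recent.append(1.0 if correct else 0.0)
    if len(recent) < WINDOW:
        return False  # not enough evidence yet
    return sum(recent) / WINDOW < LAB_ACCURACY - TOLERANCE

# Simulate a transient source of variation degrading accuracy.
flags = [update(i % 10 != 0) for i in range(100)]   # ~90% correct: ok
flags += [update(i % 2 == 0) for i in range(100)]   # ~50% correct: drift
print(any(flags[:100]), any(flags[100:]))           # prints: False True
```

A production system would typically use a dedicated concept-drift detector (see Wang and Abraham, 2015, in the references), but the gating idea is the same: once the new sources of variation are detected, they can be observed and modeled, or the project is abandoned.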
7.2 Managerial implications
Nowadays, more businesses want to implement digital transformations, indicating that
investments in this area will increase. According to Celonis, a data and execution
management company, a 2019 survey found that approximately 45% of C-suite executives
agreed on implementing this transformation but lacked a strategy for doing so. In 2019, 34%
of them spent more than $500,000 on business transformation strategies. Having a
detailed plan of action, monitoring internal processes and using clear indicators of how and
where failures are found can result in better use of resources when migrating to Quality
4.0 (Celonis, 2019).
To create an appropriate path for executives to implement these changes, they must first
observe and analyze the current issues. Technology can solve practically every problem, but
not every organization needs all the available tools. As a result, there will always be a gap between
what top managers and analysts observe (Celonis, 2019). Quality 4.0 enables easier
observation of the processes, allowing both parties to make the same decisions and creating a
strong foundation for implementing future changes to meet new industry standards.
According to Zairi (2019), there are six main components of the management system required to attain
a Quality 4.0 standard. These are:
(1) having a significant number of key performance indicators (KPIs);
(2) having high-quality KPIs;
(3) using a universal data standard;
(4) designing a reporting system of past, present and future performance to establish
trends for internal tracking of information;
(5) using technological and analytical tools to enable meaningful extraction of
information; and
(6) using predictive analytical tools for exploring future opportunities for performance
enhancement.
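Component (4), a reporting system covering past, present and future performance, can be illustrated with a toy computation. The monthly first-time-quality (FTQ) values and the simple linear-trend forecast below are illustrative assumptions; a real reporting system would draw on the universal data standard of component (3) and the predictive tools of component (6).

```python
# Minimal sketch of component (4): report past, present and a naive
# future value for a quality KPI. The FTQ series is invented data.
import numpy as np

ftq = np.array([0.91, 0.92, 0.92, 0.94, 0.95, 0.96])  # past monthly FTQ

months = np.arange(len(ftq))
slope, intercept = np.polyfit(months, ftq, 1)  # fit a linear trend

present = ftq[-1]
forecast_next = float(slope * len(ftq) + intercept)  # one month ahead

print(f"present={present:.2f}, trend={slope:+.3f}/month, "
      f"next={forecast_next:.2f}")
```

Even this crude trend line makes performance goal-oriented in the sense the list describes: the past establishes the trend, the present is tracked against it and the forecast flags whether the target will be met without intervention.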

These characteristics lead to a maturity path in which quality improvement is goal-
oriented and well understood, so that resources can be planned accordingly. Some of the benefits of this
implementation are improved product quality, reduced waste, reduced cost, minimized
rework, increased customer satisfaction, an improved environment, improved financial results,
etc. (Radziwill, 2020).

8. Conclusions
Manufacturing companies can be competitively recognized among the most advanced and
influential companies in the world by successfully implementing Quality 4.0. However,
based on recent surveys, its successful implementation is one of the most relevant
challenges confronting the fourth industrial revolution.
The supporting technologies, principles and applications of Process Monitoring for
Quality (PMQ), a Quality 4.0 initiative, are reviewed. This initiative emphasizes real-time
defect detection, in which the detection is formulated as a binary classification problem.
Following the same path as Six Sigma's define, measure, analyze, improve, control (DMAIC), an
evolved problem-solving strategy that guides Quality 4.0-based innovation is described. It is
a seven-step approach, IADLPR2, that identifies and addresses the main challenges posed
by manufacturing systems.
AI-based manufacturing innovations are mainly executed in laboratories by
researchers whose main goals are to develop breakthrough/disruptive technologies in the
mid- or long-term. This proposal motivates manufacturing/quality engineers to
drive AI-based continuous improvement and short-term innovation using the seven-step
problem-solving strategy for Quality 4.0. An actual case study from General Motors is
presented, where each of the seven steps is applied to solve a real, complex problem.
The seven-step problem-solving strategy focuses on solving the classification problem
based on traditional ML, where feature creation (the discover step) and process redesign are
important steps. However, there are many applications in which deep learning algorithms
are applied to images to replace human/visual inspections. In these cases, not all the steps
are valid, as there is no feature creation step. Future research along this path could focus on
updating the seven-step problem-solving strategy to adapt to these applications. Basically,
the idea is to have two approaches, one for ML applications and one for deep learning.

Notes
1. The authors acknowledge that some algorithms can change their taxonomy (e.g. from parametric to
nonparametric) depending upon their definition.
2. The 18 questions help with this objective.
3. Redefinition of the seven-step problem-solving strategy for image classification is out of the
scope of this study.

References
Abell, J.A., Spicer, J.P., Wincek, M.A., Wang, H. and Chakraborty, D. (2014), Binary classification of
items of interest in a repeatable process, US Patent, (US8757469B2).
Abell, J.A., Chakraborty, D., Escobar, C.A., Im, K.H., Wegner, D.M. and Wincek, M.A. (2017), “Big data
driven manufacturing – process-monitoring-for-quality philosophy”, ASME Journal of
Manufacturing Science and Engineering on Data Science-Enhanced Manufacturing, Vol. 139 No. 10,
p. 3, 4, 5, 9, 11.
Alzahrani, B., Bahaitham, H., Andejany, M. and Elshennawy, A. (2021), “How ready is higher education
for quality 4.0 transformation according to the lns research framework?”, Sustainability, Vol. 13
No. 9, p. 5169.
Atmaca, E. and Girenes, S.S. (2013), “Lean six sigma methodology and application”, Quality and
Quantity, Vol. 47 No. 4, pp. 2107-2127.
Belfiore, M. (2016), “Automation opportunities abound for quality inspections”, Automation World.
Bishop, C.M. (2006), Pattern Recognition and Machine Learning, Springer, New York, Vol. 4 No. 4,
p. 738.
Boubchir, L., Daachi, B. and Pangracious, V. (2017), “A review of feature extraction for EEG epileptic
seizure detection and classification”, 40th International Conference on Telecommunications and
Signal Processing, IEEE, pp. 456-460.
Boyes, H., Hallaq, B., Cunningham, J. and Watson, T. (2018), “The industrial internet of things (IIoT): an
analysis framework”, Computers in Industry, Vol. 101, pp. 1-12.
Breiman, L. (2001), “Random forests”, Machine Learning, Vol. 45 No. 1, pp. 5-32.
Brownlee, J.A. (2018), “Gentle introduction to k-Fold cross-validation”.
Bughin, J., Seong, J., Manyika, J., Chui, M. and Joshi, R. (2018), “Notes from the AI frontier: modeling the
impact of AI on the world economy”, McKinsey Global Institute.
Cachada, A., Barbosa, J., Leitão, P., Geraldes, C.A., Deusdado, L., Costa, J., Teixeira, C., Teixeira, J.,
Moreira, A.H., Moreira, P.M. and Romero, L. (2018), “Maintenance 4.0: intelligent and predictive
maintenance system architecture”, IEEE 23rd International Conference on Emerging
Technologies and Factory Automation, IEEE, Vol. 1, pp. 139-146.
Cai, W., Zhang, W., Hu, X. and Liu, Y. (2020), “A hybrid information model based on long short-term
memory network for tool condition monitoring”, Journal of Intelligent Manufacturing, Vol. 31
No. 6, pp. 1-14.
Celonis. (2019), “Celonis study: almost half of C-suite executives admit to launching transformation
initiatives without a clear strategy”.
Chandrashekar, G. and Sahin, F. (2014), “A survey on feature selection methods”, Computers and
Electrical Engineering, Vol. 40 No. 1, pp. 16-28.
Clemen, R.T. (1989), “Combining forecasts: a review and annotated bibliography”, International Journal
of Forecasting, Vol. 5 No. 4, pp. 559-583.
Cloud, G. (2021), “Managing machine learning projects with google cloud”, Coursera.
Davidson, I. (2004), “An ensemble technique for stable learners with performance bounds”, AAAI,
Vol. 2004, pp. 330-335.
Demuth, H.B., Beale, M.H., De Jesús, O. and Hagan, M.T. (2014), Neural Network Design, Martin Hagan,
2nd edition, ch. 2, 4, pp. 36-60, 80-112.
Deniz, S. and Çimen, M. (2018), “Barriers of six sigma in healthcare organizations”, Management
Science Letters, Vol. 8 No. 9, pp. 885-890.
Devore, J. (2015), Probability and Statistics for Engineering and the Sciences, Cengage Learning, Ch. 16,
pp. 659-660.
Escobar, A., Chakraborty, D., Arinez, J. and Morales-Menendez, R. (2021a), “Augmentation of body-in-
white dimensional quality systems through artificial intelligence”, 2021 IEEE International
Conference on Big Data (Big Data), pp. 1611-1618, IEEE.
Escobar, C.A., Abell, J.A., Hernandez-de-Menéndez, M. and Morales-Menendez, R. (2018a), “Process-
monitoring-for-quality – big models”, Procedia Manufacturing, Vol. 26, pp. 1167-1179.
Escobar, C.A., Wincek, M.A., Chakraborty, D. and Morales-Menendez, R. (2018b), “Process-monitoring-
for-quality – applications”, Manufacturing Letters, Vol. 16, pp. 14-17.
Escobar, C.A., Arinez, J. and Morales-Menendez, R. (2020), “Process-monitoring-for-quality–a step
forward in the zero defects vision”, SAE Technical Paper, number 2020-01-1302, 4.
Escobar, C.A., Macias, D. and Morales-Menendez, R. (2021b), “Process monitoring for quality–a
multiple classifier system for highly unbalanced data”, Heliyon, Vol. 7 No. 10, p. e08123.
Escobar, C.A., Chakraborty, D., McGovern, M., Macias, D. and Morales-Menendez, R. (2021c), “Quality
4.0 – green, black and master black belt curricula”, Procedia Manufacturing, Vol. 53, pp. 748-759,
49th SME North American Manufacturing Research Conference (NAMRC 49, 2021).
Escobar, C.A., Chakraborty, D., McGovern, M., Macias, D. and Morales-Menendez, R. (2021d),
“Quality 4.0–green, black and master black belt curricula”, Procedia Manufacturing, Vol. 53,
pp. 748-759.
Escobar, C.A., McGovern, M. and Morales-Menendez, R. (2021e), “Quality – challenges posed by big
data in manufacturing”, Journal of Intelligent Manufacturing, Vol. 32 No. 8, pp. 2319-2334.
Escobar Diaz, C.A., Arinez, J., Macías Arregoyta, D. and Morales-Menendez, R. (2022), “Process
monitoring for quality-a feature selection method for highly unbalanced data”, International
Journal on Interactive Design and Manufacturing (IJIDeM), (April), pp. 1-16.
EY Oxford Analytica. (2019), “Sensors as drivers of industry 4.0 – a study on Germany, Switzerland
and Austria”.
Fawcett, T. (2006), “An introduction to ROC analysis”, Pattern Recognition Letters, Vol. 27 No. 8,
pp. 861-874.
Frank, R. (2013), “Understanding smart sensors”, Artech House.
Fursule, N.V., Bansod, S.V. and Fursule, S.N. (2012), “Understanding the benefits and limitations of six
sigma methodology”, Int J of Scientific and Research Publications, Vol. 2 No. 1, pp. 1-9.
Gartner Research. (2018), “Predicts 2019: Data and analytics strategy”.
Godina, R. and Matias, J.C.O. (2018), “Quality control in the context of industry 4.0”, International Joint
Conference on Industrial Engineering and Operations Management, Springer, pp. 177-187.
Goodfellow, I., Bengio, Y. and Courville, A. (2016), Deep Learning, MIT press, Ch. 9, pp. 326-366.
Granstedt Möller, E. (2017), “The use of machine learning in industrial quality control”.
Grootendorst, M. (2019), “Validating your machine learning model”.
Harrell, F. (2020), “Classification vs. Prediction. Statistical thinking”.
Hastie, T., Friedman, J. and Tibshirani, R. (2001), “Model assessment and selection”, The Elements of
Statistical Learning, pp. 193-224, Springer.
Ho, T.K. (2000), “Complexity of classification problems and comparative advantages of combined
classifiers”, International Workshop on Multiple Classifier Systems, Springer, pp. 97-106.
Hua, J., Xiong, Z., Lowey, J., Suh, E. and Dougherty, E.R. (2004), “Optimal number of features as a function
of sample size for various classification rules”, Bioinformatics, Vol. 21 No. 8, pp. 1509-1515.
Huan, L. and Motoda, H. (1998), Feature Extraction, Construction and Selection: A Data Mining
Perspective, Springer Science & Business Media, Vol. 453, pp. 2-5.
Hui, Y., Mei, X., Jiang, G., Zhao, F., Ma, Z. and Tao, T. (2020), “Assembly quality evaluation for linear
axis of machine tool using data-driven modeling approach”, Journal of Intelligent
Manufacturing, Vol. 33 No. 3, pp. 1-17.
Imandoust, S.B. and Bolandraftar, M. (2013), “Application of K-Nearest neighbor (KNN) approach for
predicting economic events: theoretical background”, International Journal of Engineering
Research and Applications, Vol. 3 No. 5, pp. 605-610.
Intel. (2021), “5 Steps to accelerate value from your industrial IoT data”, SAS.
Javaid, M., Haleem, A., Pratap Singh, R. and Suman, R. (2021), “Significance of quality 4.0 towards
comprehensive enhancement in manufacturing sector”, Sensors International, Vol. 2, p. 100109.
Juran, J.M. (1995), “A history of managing for quality in the United States-part 2”, Quality Digest,
Vol. 15, pp. 34-45.
Krogh, A. and Vedelsby, J. (1995), “Neural network ensembles, cross validation, and active learning”,
Advances in Neural Information Processing Systems, pp. 231-238,
Kusiak, A. (2018), “Smart manufacturing”, International Journal of Production Research, Vol. 56 No. 1-2,
pp. 508-517.
Lahiru, F. (n.d.), “7 V’s of big data”, All About Data Warehousing, Data Mining; BI, 17 January 2017,
available at: http://blogsofdatawarehousing.blogspot.com/2017/01/7-vs-of-big-data.html
Lee, K. (2000), “IEEE 1451: a standard in support of smart transducer networking”, Proc of the 17th
IEEE Instrumentation and Measurement Technology Conf [Cat. No. 00CH37066], Vol. 2,
pp. 525-528, IEEE,
Lee, S.I., Lee, H., Abbeel, P. and Ng, A.Y. (2006), “Efficient L1 regularized logistic regression”,
Proceeding of the National Conference on Artificial Intelligence, Vol. 21, p. 401, Cambridge, MA.
Loukas, S. (2020), “What is machine learning: supervised, unsupervised, semi-supervised and
reinforcement learning methods? Towards data science”.
Lund, H., Østergaard, P.A., Connolly, D. and Mathiesen, B.V. (2017), “Smart energy and smart energy
systems”, Energy, Vol. 137, pp. 556-565.
Luz, A. (2017), “Why you should be plotting learning curves in your next machine learning project.
Towards data science”.
McLachlan, G.J., Do, K.A. and Ambroise, C. (2005), Analyzing Microarray Gene Expression Data,
Vol. 422, John Wiley and Sons.
Maguad, B.A. (2006), “The modern quality movement: origins, development and trends”, Total Quality
Management and Business Excellence, Vol. 17 No. 2, pp. 179-203.
Malaca, P., Rocha, L.F., Gomes, D., Silva, J. and Veiga, G. (2019), “Online inspection system based on
machine learning techniques: real case study of fabric textures classification for the automotive
industry”, Journal of Intelligent Manufacturing, Vol. 30 No. 1, pp. 351-361.
Mândru, L., Patrascu, L., Carstea, C.G., Popesku, A. and Birsan, O. (2011),
“Paradigms of total quality management”, Recent Researches in Manufacturing Engineering,
pp. 121-126, Transilvania University of Brasov, Romania.
Moen, R.D. and Norman, C.L. (2010), “Circling back”, Quality Progress, Vol. 43 No. 11, p. 22.
Montgomery, D. (2017), “Exploring observational data”, Quality and Reliability Engineering
International, Vol. 33 No. 8, pp. 1639-1640.
Montgomery, D.C. (2014), “Big data and the quality profession”, Quality and Reliability Engineering
International, Vol. 30 No. 4, pp. 447-447.
Murphy, K.P. (2012), Machine Learning: A Probabilistic Perspective, MIT press.
Ng, A. (2019), “How to choose your first AI project”, Artificial Intelligence: The Insights You Need from
Harvard Business Review, pp. 79-88.
Pannu, A. (2015), “Artificial intelligence and its application in different areas”, Artificial Intelligence,
Vol. 4 No. 10, pp. 79-84.
Permission, S. (2005), “Generative and discriminative classifiers: naive Bayes and logistic regression”.
Prem, E. (2019), “Artificial intelligence for innovation in Austria”, Technology Innovation Management
Review, Vol. 9 No. 12, p. 11.
Radziwill, N.M. (2018), “Quality 4.0: let’s get digital-the many ways the fourth industrial revolution is
reshaping the way we think about quality”, arXiv preprint arXiv:1810.07829,
Radziwill, N.M. (2020), Connected, Intelligent, Automated: The Definitive Guide to Digital
Transformation and Quality 4.0, Quality Press, pp. 24, 152.
Raschka, S. and Mirjalili, V. (2017), Python Machine Learning, Packt Publishing, Chapter 2.
Ray, S. (2017), “Understanding support vector machine algorithm from examples (along with code)”, July 7.
Sader, S., Husti, I. and Daroczi, M. (2021), “A review of quality 4.0: definitions, features, technologies,
applications, and challenges”, Total Quality Management and Business Excellence, pp. 1-19.
Said, M., Abdellafou, K. B. and Taouali, O. (2019), “Machine learning technique for data-driven fault
detection of nonlinear processes”, Journal of Intelligent Manufacturing, Vol. 31 No. 4, pp. 1-20.
Sanjay, M. (2018), “Why and how to cross validate a model? Towards data science”.
Schütze, A., Helwig, N. and Schneider, T. (2018), “Sensors 4.0–smart sensors and measurement
technology enable industry 4.0”, Journal of Sensors and Sensor Systems, Vol. 7 No. 1,
pp. 359-371.
See, J.E. (2015), “Visual inspection reliability for precision manufactured parts”, Human Factors: The
Journal of the Human Factors and Ergonomics Society, Vol. 57 No. 8, pp. 1427-1442.
Seiffert, C., Khoshgoftaar, T.M., Van Hulse, J. and Napolitano, A. (2009), “RUSBoost: a hybrid approach
to alleviating class imbalance”, IEEE Transactions on Systems, Man, and Cybernetics - Part A:
Systems and Humans, Vol. 40 No. 1, pp. 185-197.
Shakeri, S. (2020), The Future of Quality Jobs: Quality 4.0 New Opportunities for Quality Experts,
Amazon Digital Services LLC – KDP Print US, pp. 37-46.
Sharma, G.V.S.S., Rao, P.S. and Babu, B.S. (2018), “Process capability improvement through DMAIC for
aluminum alloy wheel machining”, Journal of Industrial Engineering International, Vol. 14 No. 2,
pp. 213-226.
Shmueli, G. (2010), “To explain or to predict?”, Statistical Science, Vol. 25 No. 3, pp. 289-310.
Singh, S. (2018), “Understanding the Bias-Variance tradeoff”.
Sreenivasulu Reddy, A., Sunil, Y. and Madhavi Reddy, G.V. (2022), “Study on application of six sigma
in shoe manufacturing industry”, International Journal of Research in Engineering and Science
(IJRES), Vol. 9 No. 4, pp. 15-23.
Staff, V.B. (2019), “Why do 87% of data science projects never make it into production?”.
Theodoridis, S. and Koutroumbas, K. (2001), “Pattern recognition and neural networks”, Machine
Learning and Its Applications, pp. 169-195, Springer.
Uddin, M.F. and Gupta, N. (2014), “Seven v’s of big data understanding big data to extract value”,
Proceedings of the 2014 zone 1 conference of the American Society for Engineering Education,
IEEE, pp. 1-5.
Valente Klaine, P., Ali Imran, M., Onireti, O. and Demo Souza, R. (2017), “A survey of machine learning
techniques applied to self organizing cellular networks”, IEEE Communications Surveys and
Tutorials, Vol. 19 No. 4, pp. 2392-2431.
Veeramachaneni, K., O’Reilly, U.M. and Taylor, C. (2014), “Towards feature engineering at scale for
data from massive open online courses”, arXiv preprint arXiv:1407.5238.
Venkatesh, N. and Sumangala, C. (2018), “Success of manufacturing industries–role of six sigma”,
MATEC Web of Conferences, Vol. 144, EDP Sciences, p. 5002.
Ventura Carvalho, A., Valle Enrique, D., Chouchene, A. and Charrua-Santos, F. (2021), “Quality 4.0: an
overview”, Procedia Computer Science, Vol. 181, pp. 341-346.
Villanova University. (2020), “Six sigma or big data? why not both?”, Online, October.
Wang, H. and Abraham, Z. (2015), “Concept drift detection for streaming data”, International Joint
Conference on Neural Networks, IEEE, pp. 1-9.
Wang, K.-S. (2013), “Towards zero-defect manufacturing (ZDM) – a data mining approach”, Advances in
Manufacturing, Vol. 1 No. 1, pp. 62-74.
Wasserman, L. (2013), All of Statistics: A Concise Course in Statistical Inference, Springer Science and
Business Media, Ch. 2, pp. 149-173.
Webb, G.I., Lee, L.K., Petitjean, F. and Goethals, B. (2017), “Understanding concept drift”, arXiv preprint
arXiv:1704.00362.
West, D. and Allen, J. (2018), “How artificial intelligence is transforming the world”, Brookings.
White, H. (2000), “A reality check for data snooping”, Econometrica, Vol. 68 No. 5, pp. 1097-1126.
Wolpert, D.H. (1996), “The lack of a priori distinctions between learning algorithms”, Neural
Computation, Vol. 8 No. 7, pp. 1341-1390.
Wozniak, M., Graña, M. and Corchado, E. (2014), “A survey of multiple classifier systems as hybrid
systems”, Information Fusion, Vol. 16, pp. 3-17.
Wu, L., Yue, X., Jin, A. and Yen, D.C. (2016), “Smart supply chain management: a review and
implications for future research”, The International Journal of Logistics Management, Vol. 27
No. 2, pp. 395-417.
Wu, X., Kumar, V., Ross, Q.J., Ghosh, J., Yang, Q., Motoda, H. and Steinberg, D. (2008), “Top 10
algorithms in data mining”, Knowledge and Information Systems, Vol. 14 No. 1, pp. 1-37.
Wuest, T., Irgens, C. and Thoben, K.D. (2013), “An approach to quality monitoring in manufacturing
using supervised machine learning on product state based data”, Journal of Intelligent
Manufacturing, Vol. 25 No. 5, p. 173.
Wuest, T., Irgens, C. and Thoben, K.-D. (2014), “An approach to monitoring quality in manufacturing
using supervised machine learning on product state data”, Journal of Intelligent Manufacturing,
Vol. 25 No. 5, pp. 1167-1180.
Xu, C. and Zhu, G. (2020), “Intelligent manufacturing lie group machine learning: real-time and efficient
inspection system based on fog computing”, Journal of Intelligent Manufacturing, Vol. 32,
pp. 1-13.
Zairi, M. (2018), Deep in Crisis: The Uncertain Future of the Quality Profession, Quality Series,
Independently published.
Zairi, P.M. (2019), Leading into the Future through Excellence: An Assessment Guide, Quality 4.0 Series,
Independently published, Ch. 2, 3.
Zenobi, G. and Cunningham, P. (2001), “Using diversity in preparing ensembles of classifiers based on
different feature subsets to minimize generalization error”, European Conference on Machine
Learning, Springer, pp. 576-587.
Zonnenshain, A. and Kenett, R.S. (2020), “Quality 4.0 – the challenging future of quality engineering”,
Quality Engineering, Vol. 32 No. 4, pp. 614-626.

Corresponding author
Carlos Alberto Escobar can be contacted at: [email protected]
