Bypassing Data Issues of A Supply Chain Simulation Mode 2020 Procedia Manufa
Procedia Manufacturing 42 (2020) 132–139
Abstract
Supply Chains (SCs) are complex and dynamic networks, in which certain events may cause severe problems. Simulation can help to avoid them, since it allows the uncertainty of these systems to be considered. Furthermore, the data generated at increasingly high volumes, velocities and varieties by the relevant data sources allows the simulation model to capture all the relevant elements. While developing such a solution, and due to the inherent use of simulation, several data issues were identified and bypassed, so that the incorporated elements comprise a coherent SC simulation model. Thus, the purpose of this paper is to present the main issues that were faced, and to discuss how they were bypassed, while working on an SC simulation model in a Big Data context, using real industrial data from an automotive electronics SC. This paper highlights the role of simulation in this task, since it worked as a semantic validator of the data. Moreover, this paper also presents the results that can be obtained from the developed model.
Keywords: Simulation; Supply Chain; Big Data; Data issues; Industry 4.0
2. Related Work
3.2. Methods
4.5. Transit and lead times
Some transit time durations preclude the arrival of orders at the date specified in the data. In these situations, both transit and lead times were estimated. Moreover, these problems were handled together, since transit time can be considered part of the total lead time. First, it is verified whether a transit time is specified for a given entity. If not, it is estimated from the transit times of other suppliers from the same country. Afterwards, it is verified whether the transit time allows the associated entities to arrive at the plant on the date in the data. If the durations are not adequate, the lead time and transit time values are adjusted so that the entity arrives at the plant on the arrival date specified in the data. This approach does not influence the results, as the total lead time remains the same.
4.6. Internal material movements
One of the benefits of including all material movements is the ability to model the storage strategy followed at the plant and, hence, to measure the stock level. In the considered plant, this strategy implies that materials are stored in an empty storage location. Hence, it implies the following two premises:
When a material is moved to a bin, the bin is empty before this movement occurs;
When a material is moved out of a bin, the material needs to have been previously stored in the same bin.
The organization measures its stock level by assessing the percentage of occupied bins. However, while running the simulation model, several cases in which these premises did not hold were observed. Since simulation allowed these problems in the data to be discovered, it was also used to understand their scale: the simulation model recorded the percentage of movements that do not follow the storage strategy of the plant, and the obtained results can be seen in Fig. 1. These percentages were registered after the first move to each bin.
As the figure shows, the percentage of inconsistent movements out of the warehouse maintains the same level throughout the year, with the exception of one day at the end of the year. Conversely, the number of failed movements to the warehouse is, on average, higher throughout the year.
After trying to understand this problem with process experts, two main justifications arose. The first is that not all movements are registered. The second is that movements are registered with a wrong date. For instance, a material may be consumed, but its consumption record is not created immediately (or is created with a wrong date), hence movements appear in the wrong order. This problem, in fact, demanded a change in the approach to modelling the warehouse. While the ideal approach would be to have a data structure with a position for each bin of the warehouse, this was not possible due to this data issue. Thus, the solution was to measure the variation of the total quantity of material in the plant.
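The fallback adopted in Section 4.6, tracking only the variation of the total quantity of material in the plant, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the record layout (a date and a signed quantity, positive for arrivals and negative for consumptions) and the name `total_stock_by_day` are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def total_stock_by_day(movements):
    """Return the plant-wide stock level at the end of each day.

    movements: iterable of (day, signed_quantity) pairs (assumed layout);
    positive quantities enter the plant, negative quantities leave it.
    """
    # Net change per day; the order of records within a day is irrelevant.
    daily_delta = defaultdict(float)
    for day, quantity in movements:
        daily_delta[day] += quantity

    # Accumulate the daily net changes in chronological order.
    levels = {}
    total = 0.0
    for day in sorted(daily_delta):
        total += daily_delta[day]
        levels[day] = total
    return levels


# Hypothetical records, for illustration only.
movements = [
    (date(2020, 1, 2), 100.0),
    (date(2020, 1, 2), -30.0),
    (date(2020, 1, 3), -50.0),
    (date(2020, 1, 5), 40.0),
]
levels = total_stock_by_day(movements)
```

Because only daily net changes are accumulated, movements registered in the wrong order within a day do not affect the result, although movements registered on the wrong day still shift the curve.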
Fig. 1. Percentage of movements not consistent with the storage strategy followed at the plant.
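The percentages in Fig. 1 result from checking each registered movement against the two premises of the storage strategy. A minimal sketch of such a check is given below. The tuple layout, the function name and the handling of violations are assumptions for illustration; in particular, "after the first move to each bin" is read here as skipping the first registered movement of each bin, whose prior content is unknown.

```python
def inconsistent_movement_share(movements):
    """Share of movements into and out of bins that break the storage premises.

    movements: iterable of (direction, bin_id, material) in registered order,
    with direction equal to "in" or "out" (assumed layout).
    """
    bin_content = {}  # bin_id -> material currently stored (None if empty)
    seen_bins = set()
    counts = {"in": [0, 0], "out": [0, 0]}  # [violations, evaluated]

    for direction, bin_id, material in movements:
        first_move = bin_id not in seen_bins
        seen_bins.add(bin_id)
        current = bin_content.get(bin_id)

        if direction == "in":
            consistent = current is None      # premise 1: bin must be empty
            bin_content[bin_id] = material
        else:
            consistent = current == material  # premise 2: material was stored here
            bin_content[bin_id] = None

        # The first movement of each bin is not evaluated (assumption).
        if not first_move:
            counts[direction][1] += 1
            if not consistent:
                counts[direction][0] += 1

    return {
        direction: (bad / total if total else 0.0)
        for direction, (bad, total) in counts.items()
    }


# Hypothetical movement log, for illustration only.
log = [
    ("in", "B1", "M1"),
    ("in", "B1", "M2"),   # bin not empty: violates premise 1
    ("out", "B1", "M2"),
    ("out", "B1", "M2"),  # bin already empty: violates premise 2
    ("out", "B2", "M3"),  # first movement of B2: not evaluated
    ("in", "B2", "M3"),
]
shares = inconsistent_movement_share(log)
```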
$$Q_{consumed} \cdot LT \qquad (2)$$
$$Q_{consumed} \cdot SS_{time} \qquad (3)$$
$$SF(SL) \cdot \sqrt{LT \cdot SD(Q_{consumed})^{2} + Q_{consumed}^{2} \cdot SD(LT)^{2}} \qquad (4)$$
with $Q_{consumed}$, average consumed quantity; $TT_{suppliers}$, average time between orders to suppliers; $LT$, average lead time; $SS_{time}$, safety stock in time; $SF(SL)$, safety factor based on the service level for a normally distributed demand, which in this case was considered to be 99.9%, thus obtaining the value 3.9; $SD(Q_{consumed})$, standard deviation for consumed quantity; $SD(LT)$, standard deviation for lead time.
Expression 4 was obtained from the literature and, as suggested by Ruiz-Torres and Mahmoodi [13], it is one of the most commonly used methods for the safety stock calculation.
stock of each material can be used as the stock to start the simulation. The same method was also analyzed by Schmidt et al. [14] in their review of safety stock calculation methods. This problem, in fact, remains one of the hottest and most complex research topics in the field [13], [14]. Besides the above approaches, the following were also considered:
A: Sum of all consumptions;
B: Quantity difference between all consumptions and all arrivals;
C: Sum of all consumptions until the first arrival of each material;
No initial stock.
Fig. 2 shows the evolution of the stock for each implemented method. The graph shows the stock approaches corresponding to expressions 1 to 4 with dashed or dotted lines, and the remaining four approaches with continuous lines.
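As a worked illustration of expressions 2 to 4, the sketch below computes the three candidate stock values for one material. It assumes that expression 4 has the common form of the safety factor multiplied by the square root of LT·SD(Q)² + Q²·SD(LT)², that consumption is recorded per day and lead time in days, and that the safety factor is passed in as a parameter (the text reports 3.9 for a 99.9% service level). The function name and the sample figures are hypothetical.

```python
import math
from statistics import mean, stdev


def safety_stock_candidates(consumed, lead_times, ss_time, safety_factor):
    """Evaluate expressions 2, 3 and 4 for a single material.

    consumed: consumed quantities per period (assumed daily);
    lead_times: observed lead times in the same time unit;
    ss_time: safety stock expressed in time;
    safety_factor: SF(SL), supplied by the caller.
    """
    q = mean(consumed)         # average consumed quantity
    lt = mean(lead_times)      # average lead time
    sd_q = stdev(consumed)     # sample standard deviation of consumption
    sd_lt = stdev(lead_times)  # sample standard deviation of lead time
    return {
        "expression_2": q * lt,
        "expression_3": q * ss_time,
        "expression_4": safety_factor
        * math.sqrt(lt * sd_q ** 2 + q ** 2 * sd_lt ** 2),
    }


# Hypothetical figures, for illustration only.
result = safety_stock_candidates(
    consumed=[8.0, 10.0, 12.0],
    lead_times=[4.0, 5.0, 6.0],
    ss_time=2.0,
    safety_factor=3.9,
)
```

Keeping the safety factor as an argument makes it easy to compare service levels without changing the rest of the calculation.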
Fig. 2. Evolution of the stock level using different safety stock approaches.
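The alternative starting stocks A, B and C, and the share of unfilled orders they lead to, can be sketched for a single material as follows. This is a simplified illustration rather than the simulation model itself: the event layout, the clamping of approach B at zero, the rule that arrivals on a given day are processed before consumptions, and the rule that an unfilled order is dropped instead of backordered are all assumptions.

```python
def initial_stock_options(events):
    """events: list of (day, kind, quantity), kind in {"consumption", "arrival"}."""
    total_consumed = sum(q for _, kind, q in events if kind == "consumption")
    total_arrived = sum(q for _, kind, q in events if kind == "arrival")
    arrival_days = [day for day, kind, _ in events if kind == "arrival"]
    first_arrival = min(arrival_days) if arrival_days else float("inf")
    consumed_before_first_arrival = sum(
        q for day, kind, q in events
        if kind == "consumption" and day < first_arrival
    )
    return {
        "A": total_consumed,                          # sum of all consumptions
        "B": max(total_consumed - total_arrived, 0),  # consumptions minus arrivals
        "C": consumed_before_first_arrival,           # consumptions until first arrival
        "none": 0,                                    # no initial stock
    }


def unfilled_share(events, initial_stock):
    """Fraction of consumption orders that cannot be served from stock."""
    stock = initial_stock
    orders = unfilled = 0
    # Arrivals sort before consumptions on the same day (assumption).
    for _, kind, quantity in sorted(events, key=lambda e: (e[0], e[1] != "arrival")):
        if kind == "arrival":
            stock += quantity
            continue
        orders += 1
        if quantity > stock:
            unfilled += 1  # the order is dropped, not backordered
        else:
            stock -= quantity
    return unfilled / orders if orders else 0.0


# Hypothetical events, for illustration only.
events = [
    (1, "consumption", 4),
    (2, "arrival", 3),
    (3, "consumption", 6),
    (4, "arrival", 3),
    (5, "consumption", 5),
]
options = initial_stock_options(events)
unfilled = {name: unfilled_share(events, stock) for name, stock in options.items()}
```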
Regarding this latter set, it can be seen that approach A results in a high stock level, which is related to the nature of the approach: the simulation starts with the whole quantity of materials that will be consumed throughout the year already in stock. Conversely, approach B is the result of the difference between all consumptions and all arrivals. As the graph shows, the stock indeed decreased, albeit at the cost of some unfilled orders (2%), which can be justified by the arrival of some materials later than expected (volatile demand or lead time). Approach C shows that it is not enough to consider the quantity consumed until the first arrival, as the percentage of unfilled orders increases considerably in comparison to the previous approaches. Lastly, the graph also includes a scenario without initial stock, with 59% of unfilled orders.
In their turn, expressions 1 to 3 returned considerably lower initial stock, albeit with high percentages of unfilled orders (respectively 44%, 44% and 45%). It is interesting to note that all approaches, except for expression 4 and approach A, tend to the same stock level, although all of them, except approach B, obtained the highest percentages of unfilled orders.
In sum, obtaining a method or an expression to calculate the optimum safety stock is a very complex task, as corroborated by Ruiz-Torres and Mahmoodi [13] and Schmidt et al. [14]. Thus, with all the pros and cons discussed above, the decision is certainly arguable; however, approaches B and C and expression 4 can be emphasized. Approach B resulted in the second lowest percentage of unfilled orders, while approach A cannot be selected for disruption scenarios, since it would never result in unfilled orders, as it starts with the exact stock required during the simulation. In its turn, expression 4 is one of the most adopted calculation methods in the literature [13], [14] and resulted in a lower percentage of unfilled orders than the remaining calculation methods. Hence, as the analysis suggests, there is no solution unarguably better than the others.
4.8. Production time, capacity and utilization
The simulation model must consider the production capacity of the plant, although such a metric is hard to obtain. Hence,
António A.C. Vieira et al. / Procedia Manufacturing 42 (2020) 132–139
simulation was used to estimate it. The plant's production is divided into two Departments, dedicated to different production phases: automatic insertion and final assembly. Thus, the number of capacity units of these production units was set to infinity and the results were plotted in Fig. 3. Nevertheless, besides recording the number of units in production, it was also necessary to establish a production time, which was also not available. Thus, with the help of process experts and some field observations to measure average production times, a generic normal distribution was applied to all materials. Note that the customers' orders were replaced by Production orders (as previously discussed in this section), which reduced the scope of the SC system. Thus, a considerable impact of this expression on the performance of the system is not expected.
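With capacity set to infinity no order ever waits, so the capacity needed to serve all orders can be read off as the peak number of units in production at the same time. The sketch below illustrates this idea outside of any simulation tool. The order start times, the parameters of the normal distribution and both function names are hypothetical; none of them come from the case study.

```python
import random


def sample_durations(n, mean_time, sd_time, seed=0):
    """Draw n production times from a normal distribution, truncated at zero."""
    rng = random.Random(seed)
    return [max(0.0, rng.gauss(mean_time, sd_time)) for _ in range(n)]


def peak_concurrency(start_times, durations):
    """Maximum number of units simultaneously in production."""
    events = []
    for start, duration in zip(start_times, durations):
        events.append((start, 1))              # unit enters production
        events.append((start + duration, -1))  # unit leaves production
    # At equal timestamps, departures (-1) are processed before entries (+1).
    events.sort(key=lambda e: (e[0], e[1]))

    busy = peak = 0
    for _, delta in events:
        busy += delta
        peak = max(peak, busy)
    return peak


# Hypothetical production orders, for illustration only.
starts = [0.0, 1.0, 2.0, 10.0]
fixed_durations = [5.0, 5.0, 5.0, 1.0]
required_units = peak_concurrency(starts, fixed_durations)

# The same check with sampled production times (assumed mean 5, deviation 1).
sampled = sample_durations(len(starts), mean_time=5.0, sd_time=1.0)
required_units_sampled = peak_concurrency(starts, sampled)
```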
As the figure shows, the maximum capacity of both production Departments can be determined. This is the number of capacity units required to fulfil all the orders registered in the data. The figure also shows that the required capacity for the overall production is 240 units.
The data issues discussed in this section showed that, despite the Big Data that organizations already have, it is arguable whether their data models are complete and consistent. In fact, this section showed that this is not the case in the plant considered in this case study. Hence, the solution was to apply the approaches described in this section to bypass such issues, which was done in an iterative way. The next section shows the types of results that can be obtained from the simulation model after bypassing the discussed data issues.
5. Results
In this section, the main results that can be retrieved from the developed and coherent SC simulation model are addressed. In this regard, Fig. 4 shows a picture of the model during a simulation run.
The model runs in a 3D world map view. The figure also shows some circles placed in the north of Europe. The location of these entities represents the location of the supplier. The number presented under each yellow entity is the number of days remaining for the order to be shipped to the plant. This number decreases as the simulation clock advances in time. When it is time to ship the order, the symbol of the order changes to the respective transport type, with the figure showing some of these entities highlighted. The date-time values associated with each entity represent the instant when those deliveries were shipped to the plant. Apart from graphical results, it is also possible to retrieve analytical results from the tool, with Fig. 5 showing the total quantity of materials ordered, consumed and arrived at the plant during the years of data stored in the BDW.
Fig. 5. Total quantity of materials ordered, received and consumed per week.
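A weekly aggregation such as the one plotted in Fig. 5 could be derived from dated records along the following lines. This is only a sketch: the record layout, the three kind labels and the use of ISO calendar weeks are assumptions, and the sample records are invented for the example.

```python
from collections import defaultdict
from datetime import date


def weekly_totals(records):
    """Sum ordered, received and consumed quantities per ISO week.

    records: iterable of (day, kind, quantity) with kind in
    {"ordered", "received", "consumed"} (assumed layout).
    """
    totals = defaultdict(lambda: {"ordered": 0.0, "received": 0.0, "consumed": 0.0})
    for day, kind, quantity in records:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        totals[(iso[0], iso[1])][kind] += quantity
    # Return the weeks in chronological order.
    return dict(sorted(totals.items()))


# Hypothetical records, for illustration only.
records = [
    (date(2020, 1, 1), "ordered", 100.0),
    (date(2020, 1, 3), "received", 40.0),
    (date(2020, 1, 5), "consumed", 10.0),
    (date(2020, 1, 6), "received", 60.0),
    (date(2020, 1, 7), "consumed", 25.0),
]
per_week = weekly_totals(records)
```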
The adopted approach made it possible to achieve a simulation model that is coherent and consistent with the system being modelled, in the sense that the main elements stored in the BDW are reflected in the simulations. Hence, managers from the plant can use such a tool to aid them in the analysis of uncertain and alternative scenarios. Nevertheless, the achieved results also show that simulation can be used as a data validation technique, further extending traditional data profiling ones. In fact, simulation allowed data issues to be identified, by evaluating the semantics of data, and also allowed certain missing data to be estimated.
6. Conclusions
SC systems generate huge amounts of data, due to the several data sources that are used to manage the associated business processes. Furthermore, SCs are complex systems, which makes it useful to use both Big Data and Simulation to model SC problems. With these, it would be possible to test uncertainty scenarios using simulation, as well as to consider the detail provided by Big Data. In this paper, an industrial project using real data from an automotive electronics SC was presented, which is associated with a plant of the automotive electronics industry sector. In such highly dynamic environments, it is common for data issues to be found. Thus, this paper aimed to present the most relevant data issues that were faced while developing the SC simulation model in a Big Data context, while also discussing their impact on the solution and the measures that were taken to bypass them.
Indeed, some data issues can be handled by traditional data profiling techniques. However, such techniques only allow the assessment of the data quality at a syntactic level (e.g., null value verification), which is not enough for simulation, where this verification needs to be taken to a more demanding level. In fact, in simulation there is an obligation to integrate data in such a way that it originates a coherent simulation model (in order to accurately mimic a process, all its elements must be present and coherent). In this work, the authors argue that simulation can be used as a semantic validator of the data model, advancing traditional data profiling techniques, in the sense that it allowed additional data issues to be identified and missing data to be estimated.
The identified issues, and the respective approaches that were implemented to bypass them, led to a better understanding of both the data sources and the associated business processes, hence helping in the development of the simulation model. In fact, the obtained results (both graphical and numerical) were the result of bypassing the identified issues, while still maintaining the overall coherence of the model.
Despite the huge amounts of available data (around 3 GB of data), this work showed that the data model of organizations is still incomplete, in the sense that it still does not allow complete mimics of their SC systems to be reproduced. This suggests that, despite using many software packages, spreadsheets, IS and others, organizations are still lacking relevant data that would allow the creation of accurate simulations of their SCs. Such issues included data sources which could not be obtained and data that did not reflect a given business strategy followed at the plant, indicating that the data was incomplete, or not registered in the correct order or with the correct date. Some of these issues may be related
with the top management view that often disregards the existence of low-level data (e.g.,
material movements), which is necessary in order to produce a coherent simulation model. Notwithstanding, this barrier should be bypassed when the Industry 4.0 revolution is completely materialized, which will allow some of this data to be automatically generated, stored and integrated, without the errors that may arise from manual interactions, so that analytical methods (e.g., simulation) can be employed.
In terms of future work, the following directions are highlighted. Regarding the issue of missing historical data, the BDW can be used to maintain it; however, such data will only be accessible in the mid to long term. The remaining missing data sources have to be covered with solutions aligned with the organization. Furthermore, despite the identified data issues, this paper also showed that it is possible to retrieve results from a coherent simulation model, hence allowing several types of SC risks to be analyzed. Thus, future work should also address performing such risk analyses.
Acknowledgements
This work has been supported by national funds through FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2019 and by the Doctoral scholarship PDE/BDE/114566/2016 funded by FCT, the Portuguese Ministry of Science, Technology and Higher Education, through national funds, and co-financed by the European Social Fund (ESF) through the Operational Programme for Human Capital (POCH).
References
[1] Tiwari S, Wee HM, Daryanto Y. Big data analytics in supply chain management between 2010 and 2016: Insights to industries. Computers & Industrial Engineering; 2018. 115, 319–330.
[2] Zhong RY, Newman ST, Huang GQ, Lan S. Big Data for supply chain management in the service and manufacturing sectors: Challenges, opportunities, and future perspectives. Computers and Industrial Engineering; 2016. 101, 572–591.
[3] Costa E, Costa C, Santos M. Evaluating partitioning and bucketing strategies for Hive-based Big Data Warehousing systems. Journal of Big Data; 2019. 6, 1, 34.
[4] Vieira AC, Pedro L, Santos MY, Fernandes JM, Dias LS. Data Requirements Elicitation in Big Data Warehousing. European, Mediterranean, and Middle Eastern Conference on Information Systems, EMCIS, Lecture Notes in Business Information Processing; 2019. 106–113.
[5] Vieira AC, Dias LS, Santos MY, Pereira GB, Oliveira JA. Simulation of an Automotive Supply Chain in Simio: Data Model Validation. 30th European Modeling and Simulation Symposium, EMSS; 2018. 294–301.
[6] Bokrantz J, Skoogh A, Lämkull D, Hanna A, Perera T. Data quality problems in discrete event simulation of manufacturing operations. Simulation; 2018. 94, 11, 1009–1025.
[7] Kagermann H, Helbig J, Hellinger A, Wahlster. Recommendations for Implementing the Strategic Initiative INDUSTRIE 4.0: Securing the Future of German Manufacturing Industry; Final Report of the Industrie 4.0 Working Group. Forschungsunion. 2013.
[8] Vieira AC, Dias LS, Santos MY, Pereira GB, Oliveira JA. Setting an industry 4.0 research and development agenda for simulation – A literature review. International Journal of Simulation Modelling; 2018. 17, 3, 377–390.
[9] Costa E, Costa C, Santos M. Efficient big data modelling and organization for hadoop hive-based data warehouses. European, Mediterranean, and Middle Eastern Conference on Information Systems, EMCIS, Lecture Notes in Business Information Processing; 2017. 3–16.
[10] Santos MY, Costa C. Data Models in NoSQL Databases for Big Data Contexts. International Conference of Data Mining and Big Data, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 2016. 475–485.
[11] Vieira AC, Dias LS, Santos MY, Pereira GB, Oliveira JA. Supply chain hybrid simulation: From Big Data to distributions and approaches comparison. Simulation Modelling Practice and Theory; 2019. 97, 101956.
[12] Vieira AC, Dias LS, Pereira GB, Oliveira J, Carvalho MC, Martins P. Automatic simulation models generation of warehouses with milk runs and pickers. 28th European Modeling and Simulation Symposium; 2016. 231–241.
[13] Ruiz-Torres AJ, Mahmoodi F. Safety stock determination based on parametric lead time and demand information. International Journal of Production Research; 2010. 48, 10, 2841–2857.
[14] Schmidt M, Hartmann W, Nyhuis P. Simulation based comparison of safety-stock calculation methods. CIRP Annals - Manufacturing Technology; 2012. 61, 1, 403–406.