Selmin Nurcan
Iris Reinhartz-Berger
Pnina Soffer
Jelena Zdravkovic (Eds.)
Enterprise, Business-Process
and Information Systems Modeling
LNBIP 387
Lecture Notes
in Business Information Processing 387
Series Editors
Wil van der Aalst
RWTH Aachen University, Aachen, Germany
John Mylopoulos
University of Trento, Trento, Italy
Michael Rosemann
Queensland University of Technology, Brisbane, QLD, Australia
Michael J. Shaw
University of Illinois, Urbana-Champaign, IL, USA
Clemens Szyperski
Microsoft Research, Redmond, WA, USA
More information about this series at http://www.springer.com/series/7911
Selmin Nurcan • Iris Reinhartz-Berger •
Pnina Soffer • Jelena Zdravkovic (Eds.)
Enterprise, Business-Process
and Information Systems
Modeling
21st International Conference, BPMDS 2020
25th International Conference, EMMSAD 2020
Held at CAiSE 2020, Grenoble, France, June 8–9, 2020
Proceedings
123
Editors
Selmin Nurcan
University Paris 1, Paris, France
Iris Reinhartz-Berger
University of Haifa, Haifa, Israel
Pnina Soffer
University of Haifa, Haifa, Israel
Jelena Zdravkovic
Stockholm University, Kista, Sweden
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
This book contains the proceedings of two long-running events held along with the
CAiSE conference relating to the areas of enterprise, business-process and information
systems modeling: the 21st International Conference on Business Process Modeling,
Development and Support (BPMDS 2020) and the 25th International Conference on
Exploring Modeling Methods for Systems Analysis and Development (EMMSAD
2020). The two working conferences had a joint keynote on “Automated Process
Improvement: Status, Challenges, and Perspectives” given by Marlon Dumas,
Professor of Information Systems at the Institute of Computer Science, University of Tartu.
More information on the individual events and their selection processes can be found
below.
BPMDS 2020
BPMDS has been held as a series of workshops devoted to business process modeling,
development, and support since 1998. During this period, business process analysis and
design have been recognized as a central issue in the area of information systems
(IS) engineering. The continued interest in these topics within the IS community
is reflected by the success of past BPMDS events and the recent emergence of new
conferences and workshops devoted to the theme. In 2011, BPMDS became a two-day
working conference attached to CAiSE (Conference on Advanced Information Systems
Engineering). The goals, format, and history of BPMDS can be found on the website
http://www.bpmds.org/.
In 2020, BPMDS took place virtually as an online event, keeping the general spirit
and principles of BPMDS.
The intention of BPMDS is to solicit papers related to business process modeling,
development, and support (BPMDS) in general, using quality as a main selection
criterion. As a working conference, we aim to attract papers describing mature
research, but we still give place to industrial reports and visionary idea papers. To
encourage new and emerging challenges and research directions in the area of business
process modeling, development, and support, we have a unique focus theme every
year. Papers submitted as idea papers are required to be of relevance to the focus theme,
thus providing a mass of new ideas around a relatively narrow but emerging research
area. Full research papers and experience reports do not necessarily need to be directly
connected to this theme (they still need to be explicitly relevant to BPMDS, though).
The focus theme for BPMDS 2020 idea papers was “BPM meets data.” For the 21st
edition of the BPMDS conference, we invited the interested authors to address, through
their idea papers and the online discussions held during the two days of BPMDS 2020,
issues related to the intersection between business processes and data. These relations
can be viewed along the business process life-cycle:
– Designing and modeling data-aware processes
– Integrating and incorporating different kinds and sources of data in process execution
environments (IoT, blockchain, network traffic)
– Monitoring, assessing performance and conformance, and predicting the outcomes
of running processes using the data they generate
– Creating process models from various sources of data through process discovery
BPMDS 2020 received 30 submissions from 19 countries (Australia, Austria, Brazil,
Canada, Estonia, France, Germany, Iraq, Italy, Paraguay, Portugal, Slovenia, South
Africa, Spain, Switzerland, Tunisia, Ukraine, Uruguay, and the USA). The manage-
ment of paper submission and reviews was supported by the EasyChair conference
system. Each paper received at least three reviews. Eventually, 13 high-quality full
papers and 1 short paper were selected. These include two idea papers and one
experience report.
The accepted papers cover a wide spectrum of issues related to business process
modeling, development, and support. They are organized under the following section
headings:
• Business process execution and monitoring
• BPM applications in industry and practice
• Planning and scheduling in business processes
• Process mining
• Process models and visualizations
We wish to thank all the people who submitted papers to BPMDS 2020 for having
shared their work with us, as well as the members of the BPMDS 2020 Program
Committee, who made a remarkable effort in reviewing submissions.
We also thank the organizers of CAiSE 2020 for their help with the organization
of the event, particularly adjusting to the changing circumstances during the global
crisis and facilitating the transformation to a virtual event. We also thank IFIP WG8.1
for its support, and Springer and OCS for their kind assistance in the production
of the proceedings.
EMMSAD 2020
The objective of the EMMSAD conference series is to provide a forum for researchers
and practitioners interested in modeling methods for Systems Analysis and Develop-
ment (SA&D) to meet and exchange research ideas and results. The conference aims to
provide a home for a rich heritage of modeling paradigms, including software modeling,
business process modeling, enterprise modeling, capability modeling, ontology mod-
eling, and domain-specific modeling. These important paradigms, and specific methods
following them, continue to be enriched with extensions, refinements, and even new
languages, to address new challenges. Even with some attempts to standardize, new
modeling methods are constantly being introduced, especially in order to deal with
emerging trends and challenges. Ongoing changes significantly impact the way systems
are being analyzed and designed in practice. Moreover, they challenge the empirical
and analytical evaluation of the modeling methods, which aims to contribute to the
knowledge and understanding of their strengths and weaknesses. This knowledge may
guide researchers towards the development of the next generation of modeling methods
and help practitioners select the modeling methods most appropriate to their needs.
This year, EMMSAD 2020 took a virtual form. We continued with the five tracks
which emphasize the variety of EMMSAD topics:
1. Foundations of modeling & method engineering – chaired by Jolita Ralyté and Janis
Stirna
2. Enterprise, business process & capability modeling – chaired by Jānis Grabis and
Paul Grefen
3. Information systems & requirements modeling – chaired by Oscar Pastor and
Marcela Ruiz
4. Domain-specific & ontology modeling – chaired by Dimitris Karagiannis and
Arnon Sturm
5. Evaluation of modeling approaches – chaired by Agnes Koschmider and Geert
Poels
More details can be found at http://www.emmsad.org/.
EMMSAD 2020 received 29 submissions from 20 countries (Algeria, Austria, Bosnia and
Herzegovina, Brazil, Canada, China, Estonia, France, Israel, Germany, Latvia, The
Netherlands, Pakistan, Portugal, South Africa, Spain, Sweden, Switzerland, Turkey,
and the USA). The division of submissions among tracks was as follows (a single paper
could be categorized into multiple tracks): 5 submissions related to foundations of
modeling & method engineering; 10 to enterprise, business process & capability
modeling; 15 to information systems & requirements modeling; to domain-specific &
ontology modeling; and 5 to evaluation of modeling approaches. After a rigorous
review process, which included 3 or 4 reviews per submission, 15 high-quality (11 long
and 4 short) papers were selected. They were divided into 5 sections as follows:
• Rick Gilsing, Anna Wilbik, Paul Grefen, Oktay Turetken, and Baris Ozkan: “A
Formal Basis for Business Model Evaluation with Linguistic Summaries”
We wish to thank the EMMSAD 2020 authors for having shared their work with us, as
well as the members of the EMMSAD 2020 Program Committee for their valuable reviews
in the difficult times of the COVID-19 pandemic. Special thanks go to the track chairs for
their help in advertising EMMSAD and attracting submissions. Finally, we thank the
organizers of CAiSE 2020 for their help with the organization of the event, IFIP
WG8.1 for its support, and the Springer staff (especially Ralf Gerstner, Christine Reiss,
and OCS support).
BPMDS 2020 Organization
Program Chairs
Selmin Nurcan Université Paris 1 Panthéon Sorbonne, France
Pnina Soffer University of Haifa, Israel
Steering Committee
Ilia Bider Stockholm University, IbisSoft, Sweden
Selmin Nurcan Université Paris 1 Panthéon Sorbonne, France
Rainer Schmidt Munich University of Applied Sciences, Germany
Pnina Soffer University of Haifa, Israel
Program Committee
João Paulo A. Almeida Federal University of Espírito Santo, Brazil
Eric Andonoff Université Toulouse 1, France
Saïd Assar Institut Mines-Télécom Business School, France
Judith Barrios Albornoz University of Los Andes, Venezuela
Ilia Bider Stockholm University, IbisSoft, Sweden
Karsten Boehm FH KufsteinTirol, Austria
Cristina Cabanillas Vienna University of Economics and Business, Austria
Claudio di Ciccio Vienna University of Economics and Business, Austria
Dirk Fahland Eindhoven University of Technology, The Netherlands
Claude Godard Université de Lorraine, France
Renata Guizzardi Federal University of Espírito Santo, Brazil
Jens Gulden Utrecht University, The Netherlands
Amin Jalali Stockholm University, Sweden
Marite Kirikova Riga Technical University, Latvia
Agnes Koschmider Kiel University, Germany
Henrik Leopold Kühne Logistics University, Germany
Jan Mendling Vienna University of Economics and Business, Austria
Michael Möhring Munich University of Applied Sciences, Germany
Pascal Negros Arch4IE, France
Oscar Pastor Universitat Politècnica de València, Spain
Gil Regev EPFL, Itecor, Switzerland
Additional Reviewers
Inge van de Weerd Utrecht University, The Netherlands
Jan Martijn van der Werf Utrecht University, The Netherlands
Dominik Janssen Kiel University, Germany
Sebastian Steinau Ulm University, Germany
EMMSAD 2020 Organization
Program Chairs
Iris Reinhartz-Berger University of Haifa, Israel
Jelena Zdravkovic Stockholm University, Sweden
Track Chairs
Janis Grabis Riga Technical University, Latvia
Paul Grefen Eindhoven University of Technology, The Netherlands
Dimitris Karagiannis University of Vienna, Austria
Agnes Koschmider Kiel University, Germany
Oscar Pastor Lopez Universitat Politècnica de València, Spain
Geert Poels Ghent University, Belgium
Jolita Ralyté University of Geneva, Switzerland
Marcela Ruiz Zurich University of Applied Sciences, Switzerland
Janis Stirna Stockholm University, Sweden
Arnon Sturm Ben-Gurion University, Israel
Program Committee
Giuseppe Berio Université de Bretagne Sud, France
Ghassan Beydoun University of Technology Sydney, Australia
Dominik Bork University of Vienna, Austria
Drazen Brdjanin University of Banja Luka, Bosnia and Herzegovina
Tony Clark Aston University, UK
Nelly Condori-Fernández Universidade da Coruña, Spain
Sergio de Cesare University of Westminster, UK
Mahdi Fahmideh University of Wollongong, Australia
Michael Fellmann University of Rostock, Germany
Christophe Feltus Luxembourg Institute of Science and Technology,
Luxembourg
Peter Fettke Saarland University, Germany
Hans-Georg Fill University of Fribourg, Switzerland
Ulrich Frank Universität Duisburg-Essen, Germany
Frederik Gailly Ghent University, Belgium
Mohamad Gharib University of Florence, Italy
Asif Qumer Gill University of Technology Sydney, Australia
Cesar Gonzalez-Perez Spanish National Research Council, Spain
Sérgio Guerreiro University of Lisbon, Portugal
Renata Guizzardi Universidade Federal do Espirito Santo, Brazil
Martin Henkel Stockholm University, Sweden
Additional Reviewers
Michael Poppe University of Rostock, Germany
Fabienne Lambusch University of Rostock, Germany
Felix Härer University of Fribourg, Switzerland
Florian Johannsen University of Regensburg, Germany
Automated Process Improvement:
Status, Challenges, and Perspectives
(Keynote Abstract)
Marlon Dumas
Dynamically Switching Execution Context
in Data-Centric BPM Approaches
1 Introduction
The context in which a process is executed determines essential factors at run-
time. These range from trivial ones, e.g. whether or not a process may be
executed, to complex factors, such as the selection of sub-process variants at
runtime. Making business process management systems (BPMS) context-aware
increases the flexibility of processes they execute by supporting business rules
that are enforced based on the context [1]. Informally, process context is defined
as “the minimum set of variables containing all relevant information that impacts
the design and execution of a business process” [2], which emphasizes the impor-
tance of context for process execution. However, contemporary BPMS do not
allow changing the execution context of running process instances, even though
this would increase flexibility. Consider a recruitment process from the HR
domain, in which applicants apply for a job offer, as an example in which flexibil-
ity could be increased by allowing for process context switches at runtime. More
specifically, the context of a job application process corresponds to the job offer
an applicant applies to, as well as meta information such as the department the
respective job is allocated to. Furthermore, consider an unsolicited application.
In the first case, while the context of the application process seems to be clear
at the beginning, during the course of a job interview, it might be decided that
the applicant would better fit a different job at another department. In the sec-
ond case, parts of the context, such as a concrete job offer, are missing entirely
and can only be determined after the process starts. Although both cases can
be partially handled in an activity-centric BPMS by adding gateways and loops
into the process model, this would make the process model unnecessarily com-
plex. Therefore, most companies handle cases like these by forcing applicants to
resubmit their application to a different job offer, which, in future, might cause
confusion due to multiple applications from the same person.
This paper presents solutions to these issues for object-aware BPM, a data-
centric process management paradigm, by enabling dynamic process context
switches without requiring process model changes. The paper builds upon previ-
ous work that led to the development of the PHILharmonicFlows process engine
and contributes fundamental research into the notion of process context in data-
centric BPM paradigms. The fundamentals of object-aware BPM are explained
in Sect. 2. The notion of process context in object-aware processes is examined
in Sect. 3. The concepts and algorithms for enabling context switching are pre-
sented in Sect. 4. An overview of our prototype implementation is given in Sect. 5.
Section 6 discusses related work and Sect. 7 summarizes the paper.
2 Background
The PHILharmonicFlows implementation of object-aware BPM, a data-centric
BPM paradigm, has been under development for many years and serves as a test-
bed for the concepts presented in this paper [3,4]. PHILharmonicFlows takes
the idea of a data-driven BPMS and enhances it with the concept of objects. An
object describes the structure of its contained data and process logic at design-
time whereas an object instance holds concrete data values and executes the
process logic at runtime. This may be compared to the concept of a table and
its rows in a relational database. For each business object present in a real-
world business process one such object exists. We further examine the concept
of objects utilizing an Application object from the HR domain. As can be seen
in Fig. 1, the object consists of data, in the form of attributes, and a state-based
process model describing the data-driven object lifecycle.
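To make the object vs. object-instance distinction concrete, the following minimal Python sketch mirrors the table/row analogy described above. All class, attribute, and state names are illustrative only; PHILharmonicFlows exposes no such Python API.

```python
# Sketch of the design-time/runtime split in object-aware BPM.
# An ObjectType is comparable to a table definition; an ObjectInstance
# is comparable to one of its rows.

class ObjectType:
    """Design-time definition: attribute schema plus lifecycle states."""
    def __init__(self, name, attributes, states):
        self.name = name
        self.attributes = attributes
        self.states = states

    def instantiate(self):
        return ObjectInstance(self)

class ObjectInstance:
    """Runtime counterpart: concrete attribute values and a current state."""
    def __init__(self, obj_type):
        self.type = obj_type
        self.values = {a: None for a in obj_type.attributes}
        self.state = obj_type.states[0]

application = ObjectType(
    "Application",
    attributes=["Job Offer", "Applicant", "CV", "Accepted"],
    states=["Created", "Sent", "Checked", "Accepted", "Rejected"],
)
instance = application.instantiate()       # one "row" of the Application "table"
instance.values["Applicant"] = "[email protected]"
```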
As object-aware BPM is data-driven, the lifecycle execution of an instance
of the Application object is as follows: The initial state is Created. Once an
Applicant has entered data for attributes Job Offer, Applicant, and CV, he or
she may trigger the transition to the Sent state. This causes the Application to
change its state to Sent, in which it waits until the reviewing period is over,
[Fig. 1. The Application object, comprising attributes (Job Offer, Applicant, CV, Accepted) and a state-based lifecycle process with states Created, Sent, Checked, and Accepted]
To allow for more complex business processes, many different objects and users
may have to be involved. The central artifact of an object-aware process is the data
model, an example of which can be seen in Fig. 3, with objects representing users, e.g.
Employee and Person, as well as objects such as Job Offer and Application, connected
by 1:n relations.
[Fig. 3. Example data model with objects Employee, Person, Job Offer, and Application and their 1:n relations]
This section examines the concepts we developed for enabling process context
switching. To reiterate, an object-aware process instance consists of a data
model instance that comprises many object instances. Further, there are relation
instances between associated object instances. Finally, the coordination process
instance monitors the object instances and coordinates their execution. Con-
sequently, in an object-aware process instance, many different processes, such
as lifecycle processes and the coordination process, are executed concurrently.
As one cannot simply determine a single process context for this collection of
largely independent processes, this section presents four points of view, or scopes,
one may use to examine the combined process context.
The lifecycle process is currently in state Rejected, as all attribute values from
the previous states have already been written. This includes attribute Accepted
with its value False, which forced the corresponding decision step in Checked to
trigger the transition to Rejected. In turn, this led to the current state of the
object instance being Rejected. Note that these attribute values always lead to
the same state if the lifecycle process is re-executed. In summary, if the scope
of the process context is limited to a lifecycle process, the context will solely
consist of attribute values. However, this scope is too limited for most purposes.
[Figure: example of transitive and transverse relations used to determine process context, connecting a Programmer Job Offer with the users assigned to related Application, Review, and Interview object instances]
In Sect. 3, we presented the scopes one has to consider when determining what
constitutes process context in an object-aware process. There is no simple way
of taking a single object instance or other conceptual element and determining
its process context in a general fashion, as, when including all constraints and
relations, the process context of a single object instance consists of all object
instances present in a data model instance. Without additional concepts, there
is no way to remove an object instance from its process context and re-insert
it into another, as this would also change the context of other object instances,
causing inconsistencies. This section presents concepts to enable changing or
switching only parts of the process context of one or more object instances. We
facilitate this with (a) the help of algorithms that perform the actual context
changes, and (b) the inherent execution flexibility of object-aware BPM, which
allows fixing inconsistent processes at runtime with dynamically generated forms.
The basic building block for enabling process context changes in object-aware
process management is to enable context changes at the smallest scope possible,
i.e., the process context of a lifecycle process (cf. Sect. 3.1). To reiterate, the
process context of the lifecycle process being executed in an object instance cor-
responds to the supplied attribute values. As the lifecycle process is data-driven,
its execution is advanced when certain data becomes available. The context of
a lifecycle process, therefore, changes continuously, which drives process execu-
tion. This data-driven approach allows for the re-execution of a lifecycle process
instance based on a replay algorithm. To be more precise, the data-driven nature
of lifecycle processes ensures that the lifecycle process is re-executed in an iden-
tical fashion if the attribute values, i.e., the process context, remain unchanged.
However, as we want to be able to change attribute values and then re-execute
the lifecycle process, we extended the algorithm for re-executing a lifecycle pro-
cess instance with the ability to alter attribute values (cf. Algorithm 1).
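Algorithm 1 itself is not reproduced in this excerpt. The following Python sketch illustrates the replay idea behind it under the simplifying assumption that a lifecycle can be reduced to an ordered list of states, each guarded by required attribute values; all names are hypothetical.

```python
# Hedged sketch of lifecycle replay with altered attribute values.
# Execution advances past a state only once all attributes required
# by that state have values; replay therefore halts exactly where a
# deleted value is first needed.

def replay_lifecycle(lifecycle, attribute_values, changes=None):
    """Re-execute a lifecycle instance from scratch.

    lifecycle: ordered list of (state, required_attributes).
    attribute_values: values captured before the context switch.
    changes: attribute -> new value (None deletes the value), applied
        before replay -- the extension over plain re-execution.
    """
    values = dict(attribute_values)
    for attr, new_value in (changes or {}).items():
        if new_value is None:
            values.pop(attr, None)      # e.g. a deleted relation attribute
        else:
            values[attr] = new_value

    current = None
    for state, required in lifecycle:
        if all(values.get(a) is not None for a in required):
            current = state             # data present: advance automatically
        else:
            return state, values        # halt here and wait for user input
    return current, values

application_lifecycle = [
    ("Created", ["Job Offer", "Applicant", "CV"]),
    ("Sent", []),
    ("Checked", ["Accepted"]),
]
state, _ = replay_lifecycle(
    application_lifecycle,
    {"Job Offer": "API Programmer", "Applicant": "A", "CV": "cv.pdf",
     "Accepted": False},
    changes={"Job Offer": None},        # relation deleted by a context switch
)
# state == "Created": replay halts where the missing value is required,
# mirroring the behaviour described for the Application object below.
```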
Note that it is not necessary to allow users to trigger this kind of process
context change, as it is merely considered a building block for the higher-level
user-facing context changes. Algorithm 1 is essential as it allows the lifecycle
process to be re-executed when the object instance it belongs to switches its
process context.
Once the object instances are advanced or reverted into the appropriate states
according to the process context changes they were impacted by, process consistency
is restored. Note that a change to the context of one object instance might have a
cascading impact on others, requiring a re-execution or lifecycle advancement to
restore consistency.
[Fig. 11. Relation instances of the Application, Review, and Interview object instances before and after the context switch from the API Programmer Job Offer to the C++ Programmer Job Offer, including the Employees assigned as reviewers]
While the process context changes in the previous examples were rather small
in scope, e.g., consisting of the replacement of a single relation to a parent object
instance by another, the above change is conducted in the scope of multiple
object instances at the same time. Ensuring consistency before the change would
require the user to determine replacement relations for all relation instances to
objects not existing in the new data model instance. As an example, consider the
relation between one of the Review object instances and the Employee assigned
as a reviewer, e.g. [email protected] (cf. Fig. 11). In the new data model instance (i.e.
the other department), [email protected] does not exist, causing the relation instance
between the Review and the Employee to be deleted. One way to solve this is to
require the determination of replacement relations for each deleted relation, as
previously suggested. Instead, once again, we leverage the flexible (re-)execution
supported by object-aware lifecycle processes to elegantly solve this problem.
To be precise, we delete the relations to all objects not present in the new
data model instance. Furthermore, we delete the attribute values referencing
the relations, e.g. the Job Offer attribute in the Application object (cf. Fig. 6).
Moreover, due to the presence of the Job Offer relation attribute as a step in
the lifecycle process, an instance of the Application object must not progress
past state Created without a value for Job Offer being provided. Coincidentally,
a value for an attribute with the data type “relation” is provided by creating
a relation to another object and vice versa. However, deleting the value of an
attribute, once execution has progressed past the state it is required in, causes
a lifecycle inconsistency. For example, this happens when the Applications and
Reviews are moved between the two data model instances, causing the relations
to the no longer existing Employees and Job Offers to be deleted. If we trigger
a re-execution of the lifecycle process instance of all object instances with now
deleted relation instances (cf. Algorithm 1), the data-driven lifecycle process
reacts by executing, for example, the Application object instance until the end
of the Created state, and then waiting for user input.
Here, the dynamic form generation capabilities (cf. Sect. 2) are utilized. After
changing the process context, the Application is missing a Job Offer. The form shown
in Fig. 12 is generated and added to the worklist of a personnel officer, allowing him
to select the C++ Programmer. Once the Job Offer is selected, the data-driven
lifecycle execution advances the Application object to its previous state. Similar
forms are generated for both Reviews, and once all three forms are completed,
process consistency is restored.
[Fig. 12. Application after context switch: a generated form for the Application in state Created with an empty mandatory Job Offer field, the Applicant ([email protected]) and CV (CV_ApplicantA.pdf) already filled in]
1 Feel free to log in to the live instance at https://phoodle.dbis.info
(Username: [email protected], Password: edoc.demo).
2 https://www.youtube.com/watch?v=oGKjK7K76Ck
6 Related Work
LateVa [6] enables automated late selection of process fragments based on pro-
cess context. In essence, a process model does not define all possible variations
but contains variation points that are replaced with process fragments at run-
time depending on the process context. The actual replacement is done by the
“fragment recommender”, based on data mined from historical process instances.
The inclusion of “pockets of flexibility” into workflow models is proposed
in [7]. Each pocket contains multiple individual process fragments that can be
rearranged at runtime according to the needs of the process context, allowing
for greater flexibility at certain points. CaPI (context-aware process injection)
[8] analyses process context and allows injecting process fragments into exten-
sion areas. These fragments and the context in which they may be injected are
determined at design-time using a sophisticated modeling tool.
The approaches presented in [6–8] are similar in that they limit process
context flexibility to predefined regions of a process model. Our approach aims
to remove this limitation through relaxation: instead of defining regions in which
flexibility is possible, we allow for context changes except in specific situations.
The controlled evolution of process choreographies is examined in [9]. A process
choreography describes the interactions of business processes with partner pro-
cesses in a cross-organizational setting. [9] examines ways to gauge the impact
of changes to processes with respect to their partner processes. This is a similar
problem to the one examined in this paper, determining how to understand the
impact that process context changes have on other object instances that are part
of the same data model instance.
The concept of batch regions is introduced in [10]. A batch region is a part of
a process model that may be executed in a single batch if there are other process
instances available corresponding to the same context. Similar capabilities may
be introduced to object-aware process management by extending the context
switching concepts detailed in this paper to aggregate different object instances
with similar contexts and executing them in batches.
Finally, [11] presents the context-oriented programming (COP) paradigm,
which introduces a number of interesting aspects that could be incorporated
into our future research. Combining our contribution with COP, which allows
for objects in a programming language to behave differently depending on the
context they are executed in, would be an interesting research direction. COP
introduces layering for grouping behavioral variants of code with selectors that
choose the correct variant after a context switch occurs at runtime. Similar
notions could be used to extend the research presented in this paper.
Acknowledgments. This work is part of the ZAFH Intralogistik, funded by the Euro-
pean Regional Development Fund and the Ministry of Science, Research and the Arts
of Baden-Wuerttemberg, Germany (F.No. 32-7545.24-17/3/1).
References
1. Saidani, O., Nurcan, S.: Context-awareness for business process modelling. In: 3rd
International Conference on Research Challenges in Information Science, pp. 177–
186. IEEE (2009)
2. Rosemann, M., Recker, J.C.: Context-aware process design. In: 18th International
Conference on Advanced Information Systems Engineering (CAiSE) Workshops,
pp. 149–158 (2006)
3. Steinau, S., Andrews, K., Reichert, M.: The relational process structure. In:
Krogstie, J., Reijers, H.A. (eds.) CAiSE 2018. LNCS, vol. 10816, pp. 53–67.
Springer, Cham (2018). https://doi.org/10.1007/978-3-319-91563-0_4
4. Andrews, K., Steinau, S., Reichert, M.: Enabling runtime flexibility in data-centric
and data-driven process execution engines. Inf. Syst. (2019)
5. Müller, D., Reichert, M., Herbst, J.: Flexibility of data-driven process structures.
In: Eder, J., Dustdar, S. (eds.) BPM 2006. LNCS, vol. 4103, pp. 181–192. Springer,
Heidelberg (2006). https://doi.org/10.1007/11837862_19
6. Murguzur, A., Sagardui, G., Intxausti, K., Trujillo, S.: Process variability through
automated late selection of fragments. In: Franch, X., Soffer, P. (eds.) CAiSE 2013.
LNBIP, vol. 148, pp. 371–385. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38490-5_35
7. Sadiq, S., Sadiq, W., Orlowska, M.: Pockets of flexibility in workflow specification.
In: Kunii, H.S., Jajodia, S., Sølvberg, A. (eds.) ER 2001. LNCS, vol. 2224, pp.
513–526. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45581-7_38
8. Mundbrod, N., Grambow, G., Kolb, J., Reichert, M.: Context-aware process injec-
tion. In: Debruyne, C., et al. (eds.) OTM 2015. LNCS, vol. 9415, pp. 127–145.
Springer, Cham (2015). https://doi.org/10.1007/978-3-319-26148-5_8
9. Rinderle, S., Wombacher, A., Reichert, M.: On the controlled evolution of pro-
cess choreographies. In: 22nd International Conference on Data Engineering, ICDE
2006, p. 124. IEEE (2006)
10. Pufahl, L., Meyer, A., Weske, M.: Batch regions: process instance synchronization
based on data. In: 18th International Enterprise Distributed Object Computing
Conference, pp. 150–159. IEEE (2014)
11. Hirschfeld, R., Costanza, P., Nierstrasz, O.M.: Context-oriented programming. J.
Object Technol. 7(3), 125–151 (2008)
Exception Handling in the Context
of Fragment-Based Case Management
1 Introduction
1 https://github.com/bptlab/chimera
2 http://smile-project.de/smile-projekt/
In Fig. 1(a), we have visualized the four fragments into which the last mile pro-
cess is split. The fragments are either triggered by external events (shown as
message events) or by certain available data. The first fragment is triggered by
an event called New Parcel registered, which is sent by the sender of the parcel.
For registered parcels, the recipients are notified and asked when they want to
receive their parcel. This is what happens in the activity Collect Parcel Data,
which transfers the data object Parcel from its state registered to enriched. As
soon as the information is received that the parcel has arrived at the depot (i.e.
the pick-up place, e.g., a gas station or a greengrocer just around the corner),
the parcel can be planned (cf. Fig. 1(b)), if it is enriched, into a delivery tour for
a local carrier. The activity Plan Tour calls a complex planning service. This
service collects, at a certain point in time, all parcels requesting planning with
the same postal code and looks for a suitable carrier who delivers in this area.
It plans a route which ensures that each parcel will be delivered in its specific
time slot. If a planned tour is allocated to a carrier, the parcel can be collected
at the depot by the carrier. Alternatively, recipients can pick up their parcel in
person as soon as the parcel has arrived at the depot but was not yet planned
for a tour. Then, the third fragment Fig. 1(c) is executed instead of fragment
2 and 4. The process ends when the parcel is in state delivered (Fig. 1(d)), the
goal state of this case model.
In fCM, not all of the fragments have to be used for the execution of a
certain case to achieve the defined goal of the parcel delivery. Because fCM is
both data-driven and event-driven, it is very important to have a valid Object-
Lifecycle (OLC) which is consistent with the modeled fragments [8]. It describes
the state transitions of each data object as shown in Fig. 2.
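Since Fig. 2 is not reproduced here, the following minimal Python sketch shows how such an OLC can be represented as a state-transition relation for the Parcel data object. The states follow the running example (registered, enriched, delivered); the intermediate transitions are an assumption.

```python
# Object lifecycle (OLC) of the Parcel data object as a transition
# relation: state -> set of allowed successor states. The "planned"
# state and the exact transitions are inferred from the text.

PARCEL_OLC = {
    "registered": {"enriched"},               # Collect Parcel Data
    "enriched":   {"planned", "delivered"},   # Plan Tour, or pick-up in person
    "planned":    {"delivered"},              # delivery tour by a carrier
    "delivered":  set(),                      # goal state of the case model
}

def is_valid_transition(olc, source, target):
    """Check whether a fragment's state transition conforms to the OLC."""
    return target in olc.get(source, set())

assert is_valid_transition(PARCEL_OLC, "registered", "enriched")
assert not is_valid_transition(PARCEL_OLC, "registered", "delivered")
```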
3.1 Exceptions
Usually, we describe a discrepancy between the planned flow of a business process
and reality as an exception. Nevertheless, a discrepancy does not always
have to be an exception. Lohmeyer [12] therefore distinguishes between success
and failure regarding the goal of a business process. If the goal is achieved
although there is a deviation, we talk about a special variation which is already
known to the process and pre-defined. However, if a deviation leads to failure,
Lohmeyer speaks of a real exception: a deviation which is unknown to
the process.
The difference between known and unknown exceptions traces back to Luo
et al. [13]: A deviation is unknown if it cannot be resolved with the rules of a
system that have been defined in advance. This means there is no alternative
path in the process model. Moreover, an unknown and therefore unexpected
exception cannot be handled according to [13].
In addition to the distinction between known and unknown exceptions, Russell
et al. [20] specify different exception types. This paper is based on this group-
ing and treats an exception as a clearly identifiable event which occurs at run-
time of a business process. Three of those five types are exemplified in Table 1.
represented as solid arrows. This is the case, for example, when an exception is
triggered in state started. As a consequence, the activity changes into its state
failed. The difference to the dashed arrows, which also represent state transitions,
is that the transition is executed automatically whereas the transition using the
dashed arrows is deliberately enforced from the outside. A state of an activity
can change from started to failed by two different ways, the natural one or the
enforced one. In this paper, we concentrate on activities which are already in
state started 3 . Therefore Fig. 3 only shows the possible handling strategies for
activities which are halted due to an exception.
Like the example pattern SFF-CWC-COM, each pattern of Russell et al. [20]
is divided into three parts, each marked by an abbreviation which represents one
of the three main aspects of exception handling in workflow systems which are
defined as follows:
1. Handling of the activity which provoked the exception;
2. Handling of the following activities of the whole case; and
3. Recovery measures which are needed to remove the effects.
The handling of the activity which provoked the exception is summarized by
the first part of the pattern. In the example this corresponds to the abbreviation
SFF. The first three letters of each pattern have to be read separately, i.e. Started
Force-Fail, because the first letter explains in which state an activity is (cf.
Fig. 3) when an unknown exception occurred and an automatic state transition
cannot be executed. The other two letters describe the state to which the
activity is manually transferred. So SFF = Started Force-Fail means that an
activity is halted in state started and transferred to state failed.
Whereas the first part of a 3-tuple pattern handles one activity, the second
part deals with the best strategy for handling on case-level, i.e. all of the fol-
lowing activities after the one which provoked the exception. This is necessary
because the exception that occurred could affect some or all following activities. The
abbreviation by three letters describes one of three possibilities:
1. Continue Current Case (CWC) - Every following activity will be executed
without any interruption; the workflow still exists.
3 An overview of all possible state transitions can be found in [20].
2. Remove Current Case (RCC) - Either a selection or all of the activities are
deleted.
3. Remove All Cases (RAC) - All cases of the same process model are deleted.
The last part of each pattern deals with recovery. Recovery describes the
action performed in order to remove any aftereffects of an exception to ensure
the possibility of still achieving the business goal. Three methods which are
presented in [20] can be used:
1. Do nothing (NIL)
2. Rollback (RBK) - The effects of the exception are reversed, i.e. the state of
the case is reset to shortly before the time at which the exception occurred.
3. Compensate (COM) - The damage caused by the exception that has occurred
will be compensated.
to actively intervene in the process by manipulating data and modeling new frag-
ments. So there are different ways to handle a fragment in which an exception
has occurred. On the one hand, knowledge workers can start fragments manu-
ally and on the other hand, they can terminate them on purpose. It is up to the
knowledge worker whether the termination condition is evaluated as successful
or failed. Moreover, it is very important that the rules given by the model are
strictly adhered to. This means a knowledge worker has the obligation to com-
pensate or withdraw the effects of an exception to ensure a correct execution and
achieve the business process goal. There are four options to do so:
1. Create new fragments
If an exception occurs and there is no alternative path in the process model,
an extra fragment can be modeled. Nevertheless, it is important to check all
of the following fragments to make sure everything can be executed afterwards,
e.g. through a compliance check as explained in [10].
2. Delete existing fragments
If a selection of fragments may no longer be executed, they can be removed
by the knowledge worker. Note that a fragment is only removed in the current
case, so that it is not lost across the whole instance.
3. Manipulation of states
There is a correlation between manipulation and the first two options. For
both adding a new fragment and removing one, state transitions have to be
consistent. That is the reason why a continuous verification of the object life-
cycle is necessary. Therefore, it is important that knowledge workers have the
right to put a data object into every state which is defined in the OLC. This
means they can also adapt the OLC itself. Moreover, this option is important
if you want to set back the execution of a fragment, because you have to make
sure that the data objects have the right state.
4. Do nothing
This option should be chosen carefully and only if the effects of an exception
have no influence on other fragments.
A strict differentiation into compensation and rollback as proposed in tradi-
tional exception handling is not possible anymore. Knowledge workers have to
react situationally. They have to decide for each exception individually if a roll-
back makes sense or not. So, the success of achieving a business goal lies
with the knowledge workers. Due to the complexity of a fCM business process
model, knowledge workers have to be more qualified in modeling skills than an
ACM user and have to have more rights than a PCM user to be able to add new
fragments or change states of data objects.
4.2 Notation
Exception handling on fragment-level is directly connected with the handling
of the activity which has provoked the exception and the handling on case-
level. That is the reason why an extension of the existing patterns for workflow
systems is best suited for good exception handling in the context of fCM.
These patterns already suggest a strategy on how to handle an activity and the
following ones.
Our idea is to modify the given notation of Russell et al. [20] by adding
a fourth tuple element to the existing 3-tuple pattern. In addition to handling the
activity which triggers the exception, the case-level, and the concrete recovery
measures, the notation is extended by exception handling on fragment-level.
This section explains the extension by defining specific rules because not every
extended pattern leads to meaningful exception handling strategies.
Like the first part of the given notation (i.e. the activity handling), the fourth
part is also an abbreviation consisting of three letters which have to be read
separately. The first letter specifies the further execution of the fragment in
which the exception has occurred. Suggested recovery measures are covered by
the last two letters. While the first aspect correlates directly with the handling
of the activity (first tuple element), the second topic is connected to the recovery
component which is the third part of the strategy tuple. This relation to the first
and third element of the tuple is the general rule of how to extend an existing
workflow pattern to fCM.
Whenever an exception occurs, it is important to decide whether a fragment
should be continued (C) or terminated (T) in its execution. This decision relates
to the handling of the activity which has provoked the exception. If the activity
is forced to state failed (e.g. SFF = Started Force-Fail), the fragment has to be
terminated manually by knowledge workers. Any further execution would endan-
ger a correct execution path regarding OLCs because any failed activity do not
trigger following activities within the same fragment. In contrast to that, there
are methods which still allow a continuation of the fragment. For example, SCE
(Started Continue-Execution) does not terminate any activity because there is
not state transition to failed (cf. Fig. 3) as an effect of the exception. Following
activities can therefore still be triggered. Table 2 shows this correlation of han-
dling an activity and a fragment for exceptions which has occurred in activity
state started.
This results in eight possible combinations (cf. Table 3). However, the nota-
tion of a pattern is always interpreted in the overall context. That means, a free
combination of the presented possibilities with the patterns of Russell et al. [20]
is not sensible; it is governed by certain rules. Here the direct correlation
between the handling of an activity and the recovery measures comes into play.
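As an illustration of this correlation rule, the following Python sketch represents the extended 4-tuple pattern and checks the dependency between its first and fourth elements. The mapping covers only the two activity-handling cases discussed in the text (SFF and SCE); it is a simplified stand-in for the full rule set of Table 2 and Table 3, not a reproduction of it.

```python
# 4-tuple patterns have the form ACTIVITY-CASE-RECOVERY-FRAGMENT,
# e.g. "SFF-CWC-COM-TNE". The first letter of the fourth element says
# whether the fragment is Continued (C) or Terminated (T), and it must
# agree with the activity handling in the first element.

ACTIVITY_TO_FRAGMENT = {
    "SFF": "T",   # Started Force-Fail -> fragment must be terminated
    "SCE": "C",   # Started Continue-Execution -> fragment may continue
}

def is_consistent(pattern):
    """Check the first/fourth element correlation of a 4-tuple pattern."""
    activity, case, recovery, fragment = pattern.split("-")
    expected = ACTIVITY_TO_FRAGMENT.get(activity)
    return expected is None or fragment.startswith(expected)

assert is_consistent("SFF-CWC-COM-TNE")      # force-fail, fragment terminated
assert not is_consistent("SFF-CWC-RBK-CNT")  # failed yet fragment continued
```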
Knowledge workers often have to design their own fragments or modify existing ones to
implement the presented patterns. This has huge implications for their process
modeling knowledge, but also for the verification of fCM at run-time. They have
to be adept in the domain of the respective process to choose the most suitable pattern
for handling an exception if there is more than one possible strategy. These
enormously high demands on knowledge workers require tool support to enable
the feasibility of the presented concept of exception handling in fCM.
The concepts for exception handling in fCM described in the previous section
are exemplified in this section with the help of the last mile delivery fCM model
introduced in Sect. 2 as well as the exceptions described in Sect. 3.1.
First of all, the handling for the failure of an activity is discussed. This
may be the case when a recipient’s address does not exist. The algorithm for
optimizing the delivery tour cannot be executed because it cannot identify the
stop on a map. The activity Plan Tour (Fig. 1(b)) is set to state failed and,
while the case continues in execution, the fragment which includes the failed
activity is terminated manually. Because there is no handling on the part of
the process model, knowledge workers would be notified. The mock-up in Fig. 5
shows how this could look like. The interface provides an overview of what
exception occurred and how it can be handled. Here, it is important that the
case itself continues in executing to ensure a successful delivery of all other
parcels. The parcel with the non-existent address has to be scanned again or
transferred to a human who can then correct the address in the system. This
strategy conforms to the patterns SFF-CWC-RBK-TNT and SFF-CWC-COM-
TNE. Both ensure the continuation of the case, but while the first one suggests a
rollback by terminating the fragment and restarting it, the second one recommends
compensation by creating a new fragment, e.g. for manual input of the address.
Secondly, an exception of type Resource not available is discussed: the case
that no carrier can be found to deliver a given parcel. The third fragment cannot
terminate successfully because the allocation of the planned tour failed. In this
case, the concept explained above suggests the pattern SFF-CWC-COM-TNE.
The case has to continue (CWC) because the parcel has to be delivered to achieve
the process goal. So, the best strategy to handle the situation is to compensate
the effects (COM), e.g. through modeling new fragments which allow parking
of parcels until a carrier is available. As an alternative, a new carrier could be
employed. Although this handling would work, the strategy is expensive, because
ensuring a delivery within a two-hour time slot can be very costly.
If the database is temporarily not reachable and does not receive any requests,
many fragments will not be triggered. For handling, the patterns SRS-CWC-
RBK-CNT and SRS-CWC-RBK-CNE are both possible. It is very useful to
restart the activity once it has failed and, moreover, there is no reason to stop the
case. A rollback is necessary to ensure the correct state of the needed data
objects, but on fragment-level a knowledge worker can either do nothing or add
a new fragment notifying an engineer to fix the problem.
6 Conclusion
The topic of exception handling in the context of fragment-based Case Management
(fCM) during run-time is complex and requires both a deep understanding of the
technical context and advanced skills in process modeling. This paper explains
a concept of how the strategies of Russell et al. [20] can be used for exception
handling in fCM by introducing the fragment-level. By extending the notation with
a fourth element, both the handling of the fragment in which the exception
occurred and the handling of subsequent fragments can be defined within one
pattern. Each handling method for fCM can be mapped as
and ensure direct integration into them. For each original exception handling
strategy, there are at least two extensions that can be used in a fCM application.
That is why exception handling in fCM is more powerful than in a control-flow
based system.
This paper provides a foundation that knowledge workers and developers of
an fCM application can use and build on to ensure efficient exception handling.
In the future, we want to evaluate the usability of our approach by checking the
given requirements (see Sect. 4.3) with knowledge workers, as this is essential to
guarantee the effectiveness of our future implementation.
Acknowledgement. The research leading to these results has been partly funded by
the BMWi under grant agreement 01MD18012C, Project SMile. http://smile-project.de
References
1. van der Aalst, W.M.P., Berens, P.J.S.: Beyond workflow management: product-
driven case handling. In: Proceedings of the 2001 International ACM SIGGROUP
Conference on Supporting Group Work, GROUP 2001, pp. 42–51. Association for
Computing Machinery, New York (2001)
2. Agostini, A., De Michelis, G.: Improving flexibility of workflow management sys-
tems. In: van der Aalst, W., Desel, J., Oberweis, A. (eds.) Business Process Man-
agement. LNCS, vol. 1806, pp. 218–234. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-45594-9_14
3. Dumas, M., La Rosa, M., Mendling, J., Reijers, H.A.: Fundamentals of Business
Process Management. Springer, Heidelberg (2018). https://doi.org/10.1007/978-3-662-56509-4
4. Fahland, D.: From scenarios to components. Ph.D. thesis, Humboldt University of
Berlin (2010)
5. Fahland, D., Woith, H.: Towards process models for disaster response. In: Ardagna,
D., Mecella, M., Yang, J. (eds.) BPM 2008. LNBIP, vol. 17, pp. 254–265. Springer,
Heidelberg (2009). https://doi.org/10.1007/978-3-642-00328-8_25
6. Gonzalez-Lopez, F., Pufahl, L.: A landscape for case models. In: Reinhartz-
Berger, I., Zdravkovic, J., Gulden, J., Schmidt, R. (eds.) BPMDS/EMMSAD 2019.
LNBIP, vol. 352, pp. 87–102. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20618-5_6
7. Hauder, M., Pigat, S., Matthes, F.: Research challenges in adaptive case man-
agement: a literature review. In: 2014 IEEE 18th International Enterprise Dis-
tributed Object Computing Conference Workshops and Demonstrations, pp. 98–
107, September 2014
8. Hewelt, M., Pufahl, L., Mandal, S., Wolff, F., Weske, M.: Toward a methodology
for case modeling. Softw. Syst. Model. 2019, 1–27 (2019). https://doi.org/10.1007/s10270-019-00766-5
9. Hewelt, M., Weske, M.: A hybrid approach for flexible case modeling and execution.
In: La Rosa, M., Loos, P., Pastor, O. (eds.) BPM 2016. LNBIP, vol. 260, pp. 38–54.
Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45468-9_3
10. Holfter, A., Haarmann, S., Pufahl, L., Weske, M.: Checking compliance in data-
driven case management. In: Di Francescomarino, C., Dijkman, R., Zdun, U. (eds.)
BPM 2019. LNBIP, vol. 362, pp. 400–411. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-37453-2_33
11. Kurz, M., Fleischmann, A., Lederer, M., Huber, S.: Planning for the unexpected:
exception handling and BPM. In: Fischer, H., Schneeberger, J. (eds.) S-BPM ONE
2013. CCIS, vol. 360, pp. 123–149. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-36754-0_8
12. Lohmeyer, B.: Writing Use Cases: Exception or Alternate Flow? Lohmeyer Busi-
ness UX (2013). https://www.lohmy.de/2013/03/06/writing-use-cases-exception-or-alternate-flow/. Accessed 28 Feb 2020
13. Luo, Z., Sheth, A., Kochut, K., Miller, J.: Exception handling in workflow systems.
Appl. Intell. 13, 125–147 (2000). https://doi.org/10.1023/A:1008388412284
14. de Man, H.: Case management: a review of modeling approaches. Technical report,
BPTrends (2009)
15. Marin, M.A., Hauder, M., Matthes, F.: Case management: an evaluation of existing
approaches for knowledge-intensive processes. In: Reichert, M., Reijers, H.A. (eds.)
BPM 2015. LNBIP, vol. 256, pp. 5–16. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-42887-1_1
16. Motahari-Nezhad, H.R., Swenson, K.D.: Adaptive case management: overview and
research challenges. In: 2013 IEEE 15th Conference on Business Informatics (CBI),
pp. 264–269. IEEE (2013)
17. OMG: Business Process Model and Notation (BPMN), Version 2.0. OMG
Specification, Object Management Group, pp. 22–31 (2011)
18. Pufahl, L., Ihde, S., Glöckner, M., Franczyk, B., Paulus, B., Weske, M.: Countering
congestion: a white-label platform for the last mile parcel delivery. In: Business
Information Systems 2020. Springer, Cham (to be published)
19. Ranieri, L., Digiesi, S., Silvestri, B., Roccotelli, M.: A review of last mile logis-
tics innovations in an externalities cost reduction vision. Sustainability 10(3), 782
(2018)
20. Russell, N., van der Aalst, W., ter Hofstede, A.: Exception handling patterns in
process-aware information systems. Technical report, Queensland University of
Technology/Eindhoven University of Technology (2006)
Business Process Monitoring
on Blockchains: Potentials and Challenges
1 Introduction
The goal of this paper is to clarify to what extent a blockchain can be bene-
ficial for business process monitoring. On this basis, the paper identifies a set of
research challenges worth addressing by the research community
for the design and realization of blockchain-based process monitoring platforms.
The remainder of this paper is structured as follows. Section 2 describes the fundamen-
tal elements of process monitoring. Section 3 describes the concepts on which
blockchain platforms are based and illustrates the main research conducted so
far for the process-oriented analysis of blockchain data. Section 4 examines the
challenges and opportunities we envision for a blockchain-based process moni-
toring architecture. Finally, Sect. 5 concludes the paper.
[Fig. 1. BPMN collaboration diagram of a container shipment process: a Manufacturer pool (Fill in container, Inspect container, Attach container to truck), a Shipper pool (Drive to manufacturer, Ship container, and Notify delay after a 24h timer), and a Customer pool (Detach container from truck), synchronized via events such as Truck reached manufacturer, Container attached to truck, Truck reached inland terminal, Truck reached customer, and Container delivered]
Business process monitoring aims at identifying how well running processes are
performing with respect to performance measures and objectives. Depending on
the available tools and data, a business process platform can report on the run-
ning processes, from the sole tracking of the running instances to the checking of
deviations with respect to the expected behaviour and the identification of other
anomalies. This section briefly introduces the main characteristics of business
process monitoring.
Why to Monitor. There are several reasons why a monitoring platform should
be introduced. As a general need, the process owner and the recipients are inter-
ested in verifying and demonstrating that the process is behaving correctly. A
monitoring platform can be a passive element that merely records the performed
actions, or it can actively contribute to handling the occurring deviations.
Moreover, a business process monitoring platform can pursue various objectives: to determine if activities take longer than expected to complete, if there
are bottlenecks in the process, if resources are under- or over-utilized, and if
there are violations in the process execution, among other things. Depending
on the needs of the process owner, all – or a subset – of these aspects can be
considered.
Conformance checking comprises the techniques that compare the modeled process behaviour with the behaviour evidenced by execution data. To this end, the gathered event data are replayed on the process model, so as to detect deviations from the expected behaviour. Given a process model and event data, conformance checking produces conformance-related diagnostic information.
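To make the idea concrete, the following minimal Python sketch replays a recorded trace against a strictly sequential model and reports deviations. The activity names come from the example process of Fig. 1; the replay logic is a deliberate simplification of real conformance-checking algorithms.

# A toy sequential model and a naive replay: each event in the trace is
# compared against the next expected step of the model.
model = ["Fill in container", "Attach container to truck", "Ship container"]

def check_conformance(trace, model):
    deviations = []
    expected = iter(model)
    for event in trace:
        step = next(expected, None)
        if step != event:
            deviations.append((event, step))
    return deviations

trace = ["Fill in container", "Ship container"]  # one activity was skipped
print(check_conformance(trace, model))
# -> [('Ship container', 'Attach container to truck')]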
With conformance checking, the stakeholders can verify if the execution is
in line with the process description. In particular, the nature of the model plays
an important role in defining the degrees of freedom that are left to the process
executors. A collaboration diagram (e.g., the complete collaboration diagram
in Fig. 1) will force the whole process to strictly adhere to the specifications.
A process diagram (e.g., only the portion of the process inside a specific pool)
will force the process portion belonging to that stakeholder to adhere to the
specifications. Finally, a choreography diagram will force only the interactions
among stakeholders to adhere to the specifications, leaving the stakeholders free
to alter their internal processes.
portions of the process, rather than on the process as a whole, it is much easier
for stakeholders to agree on monitoring them. In fact, only activities required
for the assessment of such constraints have to be disclosed, thus overcoming one
of the issues of conformance checking.
Blocks are connected through backward linking: every block keeps the digest of a hashing function applied to the previous block. Altogether, the links generate a chain-like structure: hence
the name blockchain. Locally to a node, transactions are subject to a total order-
ing relation: the evolution of the state of the parties’ accounts depends on the
sequence of operations recorded in the ledger. Blocks are, in fact, a measure of
time as their addition to the chain determines the passage to the next global
system state. To reward the effort of nodes, an economic incentive scheme distributes so-called cryptocurrencies to the nodes that publish the accepted blocks. Nodes participating in the network guarantee that transactions and blocks are valid and thus prevent the data structure from being tampered with.
Also, the replication of the ledger makes it possible to have the stored informa-
tion always available locally to every node. However, the ledger may differ from
node to node: the nodes reach eventual consensus on the correct sequence in the
ledger. Temporary divergences between the local images of the ledger are called
forks. The way in which access and the right to write are granted determines two main categorisations of the blockchain platform in use: private blockchains are accessible only to a restricted number of peers, as opposed to public ones; if only a selected set of participants is allowed to decide on the next blocks, the blockchain is permissioned, otherwise it is permissionless. Bitcoin and Ethereum are natively public permissionless blockchains, although for the latter private networks can be created that operate within consortia, allowing only a subset of nodes to mine blocks. Hyperledger Fabric,1 instead, is conceived as a consortium (private) permissioned blockchain.
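As a minimal illustration of backward linking, the following Python sketch (didactic only, not the data layout of any actual platform) shows how tampering with an earlier block invalidates the links that follow it.

import hashlib
import json

def block_hash(block):
    # Digest of the block's canonical serialization
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

genesis = {"prev": None, "txs": ["tx0"]}
block1 = {"prev": block_hash(genesis), "txs": ["tx1", "tx2"]}

# Tampering with an earlier block breaks the backward link kept by its successor:
genesis["txs"] = ["forged-tx"]
assert block1["prev"] != block_hash(genesis)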
Second-generation blockchains such as Ethereum and Hyperledger Fabric
support so-called smart contracts [18], that is, executable code expressing
how business is to be conducted among contracting parties (e.g., transfer digital
assets after a condition is fulfilled). In this paper, we will focus on this kind of
blockchains operating as distributed programmable platforms. Smart contracts
often require data from the world outside the blockchain sphere (e.g., finan-
cial data, weather-related information, random numbers, sensor readings from hardware
devices). However, they cannot directly invoke external APIs. Therefore, smart
contracts need software adaptors that play that interfacing role. Those arte-
facts are named oracles [22]. Oracles can be further classified as software or
hardware oracles. Software oracles aim to extract information from programmed
applications (e.g., web services), whereas hardware oracles extract data from the
physical world (e.g., IoT devices).
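A hedged sketch of the adaptor role follows; fetch_weather and submit_reading are hypothetical names, as no real oracle API is implied here.

# A software oracle as a plain adaptor: it pulls data from an off-chain
# source and pushes it on-chain, since the contract cannot call out itself.
def run_software_oracle(contract, fetch_weather):
    reading = fetch_weather()            # off-chain call, e.g., to a web service
    contract.submit_reading(reading)     # on-chain transaction carrying the data

The adaptor inverts the dependency: instead of the contract pulling external data, which it cannot do, the oracle pushes the data on-chain through a regular transaction.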
To date, preliminary approaches have been proposed that can serve as a basis for process monitoring on the blockchain. Smart contracts allow for the codification of business process logic on the blockchain, as shown in the seminal work of Weber et al. [20]. A similar approach was later applied within the Caterpillar [9] and Lorikeet [19] tools, as well as by Madsen et al. [11].
1 https://www.hyperledger.org/projects/fabric.
[Figure: blockchain-based monitoring architecture, with process status and artifacts status maintained on the blockchain.]
The monitoring logic transforms the obtained input data into aggregated information that is more meaningful
to the analysis (e.g., at a higher level of abstraction than low-level events). A por-
tion of this output can be kept private (private monitoring data) or made public
(public monitoring data), to let other interested parties check, for instance,
the compliance of the process.
When enriching the monitoring platform with a blockchain, the monitoring
logic may be encoded in one or more smart contracts. First of all, this requires
that the monitoring system includes a blockchain client which enables the com-
munication with the rest of the blockchain infrastructure. Secondly, as the output
of a smart contract is published as a transaction payload on the blockchain, the
resulting monitoring data produced by the smart contract is automatically avail-
able to anyone allowed to access the blockchain. This implies that the monitoring
logic implemented as a smart contract must be limited to the part producing public
monitoring data. On the one hand, this opportunity increases the transparency
of the monitoring and the possibility for external actors to evaluate the behaviour
of the process, as the smart contract is immutable and executed on all the nodes
in the blockchain network. On the other hand, since the publication of data on
the blockchain has an impact in terms of cost and performance, it becomes of
utmost importance to establish which monitoring data can be included in the
blockchain (i.e., on-chain, thus trusted by definition), and which ones can be left
off-chain as typical public monitoring data (i.e., only under the control of the
party producing them).
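A minimal sketch of this split is given below; the chain and private_store interfaces are hypothetical and stand for whatever on-chain client and private storage the platform uses.

# Only the public part of the monitoring output is published on-chain;
# the private part stays under the control of the party producing it.
def publish_monitoring_output(output, chain, private_store):
    chain.publish(output["public"])        # visible to all blockchain participants
    private_store.save(output["private"])  # kept off-chain by the producer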
The distinction between on-chain and off-chain data is relevant not only when considering the output of the monitoring platform, but also concerning the input of a smart contract, as smart contracts can natively operate only on data published on the blockchain. To overcome this limitation, blockchains offer oracles to extend smart contract accessibility to off-chain data. For this reason, any dataset that a smart contract requires for its computation needs to be brought on-chain, typically through such an oracle.
Observability. Although both smart contracts and the invocations of their methods are stored on the blockchain, and their execution can be performed and analyzed by any participant, most blockchains require smart contracts to explicitly define methods to retrieve their information. In other words, variables that are used by smart contracts are accessible only by the smart contract itself, unless methods to make their contents available are explicitly defined in the smart contract’s design. As a consequence, before putting a blockchain-based monitoring platform in place, care should be taken in defining which information can
be retrieved from the smart contract. For example, suppose that, to monitor the
process in Fig. 1, a smart contract is implemented that has an internal represen-
tation of the process and of the status of each activity. That smart contract may
expose a function to check whether the process conforms to the model or not,
without providing information on the activities. As a consequence, although the
smart contract internally knows that, e.g., Ship container is running and Attach
container to truck is complete, it would lack a way to communicate this infor-
mation to other smart contracts or other participants, which cannot rely on it
to determine the status of the process and its activities.
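The issue can be illustrated with a small Python analogue of such a contract (illustrative only; actual smart contracts would be written in a language such as Solidity).

# The per-activity status is internal state; the only externally callable
# method returns an aggregate verdict, so callers cannot learn *which*
# activity is running or complete.
class MonitoringContract:
    def __init__(self):
        self._activity_status = {"Attach container to truck": "complete",
                                 "Ship container": "running"}

    def conforms(self):
        return all(s in ("running", "complete")
                   for s in self._activity_status.values())

contract = MonitoringContract()
print(contract.conforms())  # True, but the activity-level detail stays hidden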
To mitigate this issue, one could “debug” a smart contract by tracing the execution of each transaction since its deployment, thus identifying
the variables and how they change over time, similarly to the approach of Duch-
mann and Koschmider [5]. However, if the discovered information is required
by another smart contract, this information should be provided off-chain even
though it originated on-chain, with consequent trust issues and the need, once
again, to rely on an oracle.
Time Management. Among the several aspects that are interesting to monitor
about a business process, one of the most pivotal is checking if an activity, or a
group of them, is performed on time. Nevertheless, implementing a smart con-
tract able to verify this condition could be cumbersome as a blockchain lacks
a notion of time aside from the coarse-grained block time [14]. In more detail,
although a blockchain sorts the transactions, it cannot deal with timers. This
is due to the fact that the expiration of a timer, or more simply a clock-ticking
event, would be an action that originates from the smart contract itself. How-
ever, as a smart contract can only perform actions that are externally invoked,
such actions cannot be performed without the help of an external entity. For
example, suppose that a smart contract is adopted to check whether activity
Ship container is executed on time. That smart contract cannot determine that
activity Ship container took longer than 24 h until it receives a notification that
the activity was completed, unless it is actively polled by an external entity.
For this reason, time must be managed externally to the blockchain by means
of specific oracles which must be configured by the smart contract to send a
trigger whenever a timeout expires. It is also important to consider that those
oracles are external to the blockchain by definition, hence outside the chain of
trust managed by the blockchain. For this reason, when designing a time oracle,
the situation in which the oracle experiences a failure or produces fake data
(e.g., it goes out of sync) must be taken into consideration. To mitigate this
issue, oracles may integrate time synchronization protocols.
Mechanisms for enabling late binding of oracles to smart contracts are thus
desirable for a proper design. Notice that late binding would also tackle prob-
lems of reliability. Without that mechanism in place, an oracle that is no longer
available cannot be replaced.
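A sketch of such a time oracle follows, assuming a hypothetical contract interface with is_complete and notify_timeout methods.

import time

# The clock is watched off-chain; the contract is only ever triggered
# from outside, in line with the discussion above.
def time_oracle(contract, activity, deadline, poll_interval=60):
    while time.time() < deadline:
        if contract.is_complete(activity):
            return                       # completed on time, nothing to do
        time.sleep(poll_interval)
    contract.notify_timeout(activity)    # external trigger into the chain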
Data Size. In a blockchain, the larger the amount of stored data is, the more
expensive the transaction gets. This simple rule has a significant impact on
monitoring costs. Indeed, in the initial approaches [8,14], all the data that could
be useful for monitoring were supposed to be stored on-chain. Nevertheless, to
reduce these costs, care should be taken in the design of the smart contract to
minimize the amount of on-chain information to the sole data that are required
to perform monitoring [8]. To this aim, distributed file systems such as IPFS2
can be adopted to store the entire monitoring data set. Then, the transaction
only includes a link to externally stored data, and a hash value computed to
guarantee immutability. However, as smart contracts cannot natively retrieve
and process off-chain data, this could imply that oracle-mediated operations are
required again.
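A sketch of this pattern is shown below, assuming hypothetical store and chain interfaces (e.g., an IPFS client and a blockchain client).

import hashlib

# Keep the bulky monitoring data off-chain and anchor only a link and a
# digest on-chain, bounding the transaction size and cost.
def anchor_monitoring_data(payload: bytes, store, chain):
    link = store.put(payload)                        # e.g., an IPFS content address
    digest = hashlib.sha256(payload).hexdigest()     # guards the data's integrity
    chain.publish({"link": link, "sha256": digest})  # small, fixed-size record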
Side Effects. Most blockchains are prone to soft forks, i.e., branches in the chain
of blocks caused by two or more blocks pointing to the same predecessor. To solve
ambiguities, blockchain clients consider as valid the longest chain, that is, the
one having the highest number of subsequent blocks originating from the point of
forking. From a monitoring standpoint, this lack of information consistency is an
issue, since valid monitoring data may be lost if the block containing them happens to lie on a discarded post-fork branch.
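A minimal sketch of the longest-chain rule and its consequence for monitoring data (block identifiers are invented):

# Branches share a common prefix up to the forking point; monitoring data
# recorded in blocks of the losing branch are discarded along with it.
def canonical_chain(branches):
    return max(branches, key=len)

branch_a = ["b0", "b1", "b2a"]
branch_b = ["b0", "b1", "b2b", "b3b"]
print(canonical_chain([branch_a, branch_b]))  # ['b0', 'b1', 'b2b', 'b3b']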
Aside from soft forks, public blockchains such as Ethereum are also prone
to so-called hard forks. In case a change in the consensus protocol is made
2 Interplanetary File System (IPFS), https://ipfs.io.
– for either technical or political reasons – some participants may not accept
it. Unlike soft forks, hard forks cause a split in the blockchain network, which
hampers interoperability. From the monitoring standpoint, hard forks may break
the platform if some participants decide not to migrate to the new protocol.
5 Conclusion
Throughout this paper, we have discussed the advantages and challenges that
come along with the interplay between blockchain data and process analysis for mon-
itoring. Despite the growing interest in the adoption of blockchain technologies
for process execution environments, research in that direction is still at its early
stages. Considering a reference architecture for the realisation of blockchain-
based process monitoring, we have focused on the role that smart contracts,
oracles and data management strategies play, in pursuit of a fruitful discussion
in the community that drives the adoption of blockchain in process monitoring.
References
1. van der Aalst, W.M.P.: Business process management: a comprehensive survey.
ISRN Softw. Eng. 2013(507984), 37 (2013)
2. Beyer, J., Kuhn, P., Hewelt, M., Mandal, S., Weske, M.: Unicorn meets Chimera:
integrating external events into case management. In: Proceedings of the BPM
Demo Track, pp. 67–72 (2016)
3. Cappiello, C., Comuzzi, M., Daniel, F., Meroni, G.: Data quality control in
blockchain applications. In: BPM (Blockchain and CEE Forum), pp. 166–181
(2019)
4. Di Ciccio, C., et al.: Blockchain support for collaborative business processes. Infor-
matik Spektrum 42, 182–190 (2019)
5. Duchmann, F., Koschmider, A.: Validation of smart contracts using process mining.
In: ZEUS, pp. 13–16 (2019)
6. Filtz, E., Polleres, A., Karl, R., Haslhofer, B.: Evolution of the bitcoin address
graph. In: Haber, P., Lampoltshammer, T., Mayr, M. (eds.) Data Science - Ana-
lytics and Applications, pp. 77–82. Springer, Wiesbaden (2017). https://doi.org/10.1007/978-3-658-19287-7_11
7. Haslhofer, B., Karl, R., Filtz, E.: O bitcoin where art thou? Insight into large-scale
transaction graphs. In: SEMANTiCS (Posters, Demos) (2016)
8. Klinkmüller, C., Ponomarev, A., Tran, A.B., Weber, I., van der Aalst, W.: Mining
blockchain processes: extracting process mining data from blockchain applications.
In: BPM (Blockchain and CEE Forum), pp. 71–86 (2019)
9. López-Pintado, O., Garcı́a-Bañuelos, L., Dumas, M., Weber, I., Ponomarev, A.:
Caterpillar: a business process execution engine on the Ethereum blockchain. Softw. Pract. Exp. 49(7), 1162–1193 (2019)
10. Ly, L.T., Maggi, F.M., Montali, M., Rinderle-Ma, S., van der Aalst, W.M.P.:
Compliance monitoring in business processes: functionalities, application, and tool-
support. Inf. Syst. 54, 209–234 (2015)
11. Madsen, M.F., Gaub, M., Høgnason, T., Kirkbro, M.E., Slaats, T., Debois, S.:
Collaboration among adversaries: distributed workflow execution on a blockchain.
In: FAB, pp. 8–15 (2018)
12. Mendling, J., et al.: Blockchains for business process management - challenges and
opportunities. ACM Trans. Manag. Inf. Syst. 9(1), 4:1–4:16 (2018)
13. Meroni, G., Baresi, L., Montali, M., Plebani, P.: Multi-party business process com-
pliance monitoring through IoT-enabled artifacts. Inf. Syst. 73, 61–78 (2018)
14. Mühlberger, R., Bachhofner, S., Di Ciccio, C., Garcı́a-Bañuelos, L., López-Pintado,
O.: Extracting event logs for process mining from data stored on the blockchain.
In: Di Francescomarino, C., Dijkman, R., Zdun, U. (eds.) BPM 2019. LNBIP,
vol. 362, pp. 690–703. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-37453-2_55
15. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008). https://bitcoin.org/bitcoin.pdf
16. Prybila, C., Schulte, S., Hochreiner, C., Weber, I.: Runtime verification for business
processes utilizing the bitcoin blockchain. In: FGCS (2017)
17. Reichert, M., Weber, B.: Enabling Flexibility in Process-Aware Information Sys-
tems - Challenges, Methods, Technologies. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-30409-5
18. Szabo, N.: Formalizing and securing relationships on public networks. First Monday
2(9) (1997). https://firstmonday.org/ojs/index.php/fm/article/view/548
19. Tran, A.B., Lu, Q., Weber, I.: Lorikeet: a model-driven engineering tool for
blockchain-based business process execution and asset management. In: BPM
Demos, pp. 56–60 (2018)
20. Weber, I., Xu, X., Riveret, R., Governatori, G., Ponomarev, A., Mendling, J.:
Untrusted business process monitoring and execution using blockchain. In: La
Rosa, M., Loos, P., Pastor, O. (eds.) BPM 2016. LNCS, vol. 9850, pp. 329–347.
Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45348-4_19
21. Wood, G.: Ethereum: a secure decentralised generalised transaction ledger (2018). https://ethereum.github.io/yellowpaper/paper.pdf
22. Xu, X., Weber, I., Staples, M.: Architecture for Blockchain Applications. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-03035-3
BPM Applications in Industry and
Practice (BPMDS 2020)
Factors Impacting Successful BPMS Adoption
and Use: A South African Financial Services
Case Study
1 Introduction
Business process (BP) agility is defined as an organisation’s ability to swiftly alter its BPs in response to changes in the market [1], and is important for competitiveness. Yet,
BP agility is challenged by rapidly evolving technologies and business environments [2,
3]. To achieve BP agility, BP management software, also referred to as BP Management
Suites (BPMS), is often combined with various information technology (IT) architectures,
such as service oriented architecture (SOA) [4, 5].
BPMS solutions, packaged as a single solution, are collections of software such as
graphical modelling tools, process analysis tools, orchestration engines and integration
platforms [6]. Software tools earlier described as workflow, business intelligence, rules
engines, or enterprise application integration tools are now integrated into BPMS prod-
ucts. BPMS and SOA are seen as two sides of the same coin [7]. In 2012, Gartner defined
the BPMS market as one of the fastest-growing markets within the IT industry [8].
2 Literature Review
Innovation adoption is defined as the decision of an individual or organization to use
an innovation. Hence, organisational adoption of a technology includes, but is much broader than, individual technology adoption [14]. Usage of the technology follows the
adoption decision. There has been a call from researchers for a more holistic approach to studying adoption, as reflected in the call for papers of the ECIS 2019 IS innovation and adoption track [15]. The predominant research approach focuses on variables that
contribute to individual adoption and is said to distance researchers from practitioners
[15]. A more holistic approach is Alter’s systems theory of IT innovation, adoption and adaptation [16], referred to as the work system approach.
A work system is seen as a natural unit for analyzing the adoption of socio-technical systems in organisations [15] and comprises four elements: processes and activities;
participants; information; and technology. Furthermore, the work system needs to be
considered in terms of the products and services it produces; the customers it serves; the
organizational environment, strategies and infrastructure. Alter notes that often orga-
nizational IT is mandated, but there are post-adoption environments where employees
might not comply with prescribed business processes and/or IT usage patterns. It is also
noted that this area of research, although highly relevant, is under-researched. A systems
approach sees the entity being adopted as not the technology but the information system
or work system which comprises the people, processes and information, in addition to
the technology [15, 16]. Hence, when looking at successful BPMS adoption and use, one needs to look more broadly, at the adoption and use of the relevant BP management (BPM) practices and processes that support adoption and use of the BPMS technology.
Table 1. Mapping core BPM, work system framework and socio-technical change elements.

Core BPM elements      | Socio-technical change elements | Work system framework elements
Strategic alignment    |                                 | Strategies
Methods                | Task                            | Processes and activities; information, products and services
IT                     | Technology                      | Technologies and infrastructure
People                 | Actors                          | Participants and customers
Governance and culture | Structure                       | Environment
The core BPM elements have been confirmed in some empirical studies. For example,
one study noted that to achieve effective BPM solution implementation, the following
needs to be achieved: the organisation should have adequate IT infrastructure to sup-
port a process orientated architecture; individuals within the organisation should have
a comprehensive understanding of process orientated frameworks; and the organisation
should have an effective change management process regarding software changes [21].
Successful BPM has been found to depend on employees’ attitudes towards embracing
BP change [22], people change management can be extremely challenging [23] and BPM
projects frequently fail due to cultural issues [24]. It is also suggested that IT capabilities
need to ensure BP efficiency as opposed to rudimentary BP automation [25]. Ensuring
that the appropriate tools and IT infrastructure are in place for BPM has also been seen to
be critical [17, 26]. While adopting a BPMS is intended to produce agile BPs, it needs to
be acknowledged that IT can be both an enabler and a disabler for business agility [3].
Factors that contribute to inflexible IT solutions include: insufficient capacity and project
priorities of IT staff members, traditional architectures and the complexity of integrat-
ing with legacy applications within the organization, and poor interfacing capabilities
of legacy applications [3].
Data was analysed as soon as it was collected and prior to conducting further inter-
views. This allowed questions to be amended based on the themes that emerged. As data
was iteratively analysed, new themes emerged. The Attride-Stirling [28] six-step induc-
tive method of thematic analysis was followed: 1) Coding the text; 2) Identifying themes;
3) Developing the thematic network; 4) Describing and exploring the network; 5) Sum-
marising the network; 6) Interpreting patterns emerging from the data. The core BPM
element framework was used during thematic analysis as a lens to classify the themes
that emerged from the case study. The thematic network and themes are presented in the
findings section.
BigFin is a well-established organisation operating in the investment and insurance
industry, listed on various stock exchanges and a constituent of the Financial Times
Stock Exchange (FTSE) 100 index. BigFin was selected as it has an established BPM
IT team supporting a BPMS. As BigFin was established many decades ago, it has a very
large legacy IT estate. BigFin consists of various business units with their own strategies,
budgets and visions and has multiple projects that run concurrently. The BPMS was first
implemented during 2014 with the assistance of the vendor, and was implemented on
Microsoft .NET and Microsoft SQL platforms using the native interfaces of the BPMS
tool. This allows multiple integration points to other applications within BigFin via the
SOA layer. An architectural review noted that the BPMS has the capabilities to be used
as a strategic solution within BigFin to improve BP agility, scale its infrastructure
and accommodate high user concurrency (D3). Management believed that the BPMS
is a vital enabler for attaining a more client centric and process-oriented approach to
business (D2). Improved reporting, segregation of duties and a clear audit trail were also
cited benefits (D2). The main purpose of the BPMS was to model business processes,
automate process steps, integrate with applications and manage workflow, mainly for
enterprise wide processes (D3). The BPMS was also expected to improve BigFin’s BPM
maturity.
Strategy can apply to the organisation, the department and the work system itself. The
work system framework stresses the importance of these being aligned [29]. The work
system strategies should support departmental strategies and ultimately organisational
strategies. Three strategic alignment themes emerged.
BPM and Business as Usual are not Strategic Priorities. Strategic priorities of
BigFin were impacting budget allocation. I1 highlighted that legislative requirements
in BigFin take the highest priority, followed by strategic projects and then business as
usual and other BPM IT projects. D1 and D2 noted that key resources are utilised by
strategic projects, leaving few resources available for BPM migration projects. I4 noted
that the current strategic priority is delivering a new product into the market which is
at the expense of setting up a BPM centre of excellence that would govern processes
implemented on the BPMS. I2 reiterated this as he stated, “One of the challenges is
when we have these strategic initiatives the business as usual improvements and agility
fall by the wayside.”
Legacy System Strategy Misaligned with BPM Strategy. It was noted that the strat-
egy for legacy systems was misaligned with the BPM strategy. A2 noted that the strategy
clearly defines the core IT architectures that BigFin requires to become an agile enter-
prise. This entails defining where the enterprise is now, what the roadmap to the desired strategic outcome is, and what the potential hurdles to achieving that outcome are. These hurdles come in the form of licences for software products the BPMS integrates with, which have been bought for a defined period. As a result, decisions have
been made by senior management to utilize these software products until the licences
expire as they have already been paid for. This creates obstacles to achieving agility
within business processes as alternative solutions cannot be implemented until software
licences have expired.
Lack of BPM Strategic Vision. Lack of BPM strategic vision was identified by three
of the interviewees as a factor contributing to the lack of BP agility. I3 noted that there was
no central directive within BigFin regarding the BPM strategy although a BPM centre of
excellence would assist in formalising the BPM vision within BigFin to provide efficient
and effective BPM which would support BP agility. Organisations that implement a BPM
centre of excellence offer consistent and cost-effective BPM services and can adopt a
project portfolio management approach to BPM enabling IT teams to implement agile
BPs [30].
With respect to niche processes, A4 explained that one business unit had five different prod-
uct lines but seventy-eight different implementations on the BPMS because every time
a new line of business or a new type of customer came on board, they developed a niche implementation
for that scenario. These decisions were seen to be driven by a lack of alignment between
the BPM and legacy system strategies.
Misalignment Between Business and BPM IT Teams. The participants of the inter-
view process addressed issues such as overdesign of solutions, silos within BigFin that
operate in isolation and business units not willing to change the way they conduct their
business processes. Alter highlights that all components within a work system should
be aligned [29]. A1 noted that there was misalignment between the different develop-
ment teams and business in terms of what they were expected to deliver. Change on the
business unit side of the process resulted in IT staff having to work differently and think
differently and, in this case, there was a lack of process thinking in the IT teams. I2 reit-
erated this by stating, “Arguably the biggest challenge is the business change in thinking.
Don’t just own your users’ tasks, own a process end to end.” Not having a BPM centre
of excellence or process architects that govern process design and implementation was
another factor that contributed to the misalignment between business teams and BPM
IT teams. It seemed that this lack of alignment was being driven by the lack of a central
BPM strategy and BPM not being a strategic priority.
Budget Allocation for BPM Business As Usual Initiatives. While BigFin appeared
to support strategic projects well, it was noted that as soon as a project shifts from
the build phase into the support or business as usual phase after implementation, the
funding for that solution is no longer available. A2 noted that, “The problem within
BigFin is when a project is in a project phase there is money available but as soon as it
flips over to the BAU phase there’s no money. What that means is that just enough money
is supplied to the project to keep it running. There is no additional funding supplied to
grow it. So that is probably the single biggest challenge that we have.” As no addi-
tional funding is supplied for continuous process improvement, agility within business processes is sacrificed. Hence, while the BPM IT team sees the potential benefits that
solution changes will provide, they cannot implement them as no funds are available.
The budget allocation was clearly being impacted by the lack of a central BPM strategy
and BPM not being a strategic priority.
Under IT impacts, one dominant theme emerged: the lack of agility due to integration with legacy and external software applications. Integration is the predominant theme, as it was addressed by seven of the eight interviewees. Integration
complexities range from interactions with legacy applications, which cannot be changed or have incomplete data, and the tightly coupled nature of legacy application integration, to complexities regarding integration outside BigFin’s secure network. Changes to appli-
cations or web services that are consumed by the BPMS impact the time to deliver a
process implemented on the BPMS. D3 confirmed that the BPMS solution has multiple
integration points via BigFin’s web service integration layer. If a change is required for
a web service that is consumed by several other applications, extensive impact analysis
needs to be performed in order to determine if the required change for the BPM project
poses a risk to the other applications. I4 referred to the increase in required analysis,
design, governance and testing when altering processes integrated with other applica-
tions. A3 stated that a problem with integrating with legacy applications is that some of
them do not have REST and SOAP capabilities. They only offer point to point tight inte-
gration which creates tightly coupled solutions. The literature confirms that with tightly
coupled IT solutions that integrate business processes across various disparate software
applications even the smallest of changes become time consuming with a degree of risk
[35]. “Being so highly integrated sometimes I worry it is not necessarily enabling us to
change quickly” (I4).
Integration is also impacted by data incompleteness, an inability to change legacy applications, and security concerns with external applications. A1 highlighted that data
governance was limited when many legacy applications were developed which impacts
the accuracy and completeness of the data. This has agility implications if certain data
validations need to be introduced within a process. I2 noted that BigFin’s resistance to investing funds in the aging legacy applications the BPMS integrated with impacted use of the BPMS and diminished process agility, as legacy solutions will not be changed. A3
noted that when integrating with applications outside of BigFin’s secure network, various
considerations need to be made in terms of establishing secure communication channels
and if an external party changes the application, the process of implementing secure
integration channels needs to be repeated.
4.4 People
The BP people category refers to the individuals and groups that improve BPs [19]. The
dominant theme was found to be resourcing constraints for BPM initiatives. Resourcing
constraints in this study refers to the limited time that IT staff from application teams,
the BPM IT team and architecture teams can spend on the BPM projects and the inability
to staff the BPM IT team. This is an area of concern within BigFin as five out of the
eight interviewees raised staff resourcing as a challenge for the BPM IT team.
The BigFin BPM IT team appeared to be constantly understaffed. I2 confirmed that
finding high calibre resources that are technically capable and who possess the business
understanding proved to be a challenge for the BPM IT team. A2 noted that developers
do not find BPM development appealing as the skill is perceived to be a niche technology
skill not broadly utilized. These developers would rather work with pure object-oriented languages like Java or C#.
Key resources within the BPM IT team face similar situations in terms of resource
constraints. Training and on-boarding of new staff is fundamental to ensure knowledge
sharing and continuity when key resources leave. I4 stated that senior staff members
are responsible for all new staff on-boarding within the BPM IT team and are also
responsible for all BPMS design documentation and its review. Because of high staff
turnover, they have very little time to maintain existing processes to ensure agility is
retained.
A3 also indicated that it is extremely difficult to deliver BPM processes in an inte-
grated environment without constant interaction with the various application teams.
These interactions involve consulting with the various teams to ensure they are aware
of how to integrate with the BPMS. These consultations are not just from a techni-
cal perspective but also relate to standards and governance. This is a further resource
drain on the IT resources within the BPM IT team. The resource constraints of other
teams ultimately affect delivery for the BPM IT team. I4 highlighted that project and
hence resourcing priorities of the IT teams that support applications that integrate with
the BPMS may not be aligned and therefore BP change cannot be implemented until
resources are allocated.
People are known to be assigned to roles and project teams based on managers’ experience of people, their availability and the required skills [36]. Often projects tend
to draw on a common resource pool within the organisation [37]. As large organisations
run multiple projects concurrently, obtaining time from the most valuable resources is
challenging. A2 noted that he is often one of the resources whose time is debated over.
He is allocated to BigFin’s number one strategic project where he needs to provide input
in terms of the overall program architecture. However, there are also other strategic
projects that require his attention. As a result, he does not have much time to spend with
the BPM IT team or on continuous process improvement hence BPMS usage suffers.
4.5 Methods
The BPM method category refers to methods specific to the BP lifecycle [19]. Two
themes emerged under the method category.
Bypassing Design and Code Approval Procedures. In addition to the pre-project ini-
tiation processes, approval is needed for the artefacts that are produced. It appeared
that these approval procedures were being bypassed and this concern was referred to
the most by interviewees (14 mentions). I1 stated that problems arise when developers
or project managers try to rush projects into the production environments by trying
to bypass sign-off procedures resulting in inflexible and niche BPs. It seemed that the
desire for niche BPs in many cases was driven by the culture, which was resistant to
change. Sign-off processes could also have been implemented more efficiently if proper
governance had been completed. Pre-production approval processes come in the form
of security sign-off and code review sign-off and ensure that final approval is obtained
from a design, technology, quality assurance, architecture, risk and security perspective.
Although these processes are necessary for implementing efficient and agile BPs in the
long term, the interviewees noted that these processes also hamper quick delivery of BP
changes in the short term.
This research aimed to identify factors that affect the successful adoption and use of the
BPMS by the IT team and success was defined as achieving BP agility. While the BPMS
was adopted and in use, agile BPs were not achieved at BigFin. Hence as the interviews
progressed, it was noted that the major factors identified were all negatively impacting
success. The eleven dominant factors are presented in Table 3 and the interlinked factors
are shown in Fig. 1.
The sources column represents the number of interviewees that made statements
related to each theme and the references column represents the numbers of statements
made by the interviewees. While the respondents referred to concerns with the BPMS
work system itself (its methods, technology and participants), these concerns were being
driven by governance and cultural concerns, and these in turn were driven by strategic alignment concerns. Together these factors negatively impacted successful BPMS adoption and use and BP agility.

Fig. 1. Explanatory model of factors negatively impacting successful BPMS adoption and use.
While the organisation wanted to improve their BPM maturity, they did not have
a BPM strategic vision and the “business as usual” nature of process improvement
and BPM were not seen as strategic priorities. This resulted in misalignment between
business and IT teams and insufficient budget allocation for BPM. Both factors drove
resourcing constraints for BPM initiatives.
The organisation had a large legacy estate and this impacted the lack of agility
because of technical integration implications and because of governance decisions made
regarding changing legacy applications. These in turn were driven by their legacy system
strategy being misaligned with the BPM strategy for agility. Software development
methods had been put in place to ensure long term BP agility but these resulted in
slowing software development in the short term. This and the reluctance of business to
change and standardise resulted in bypassing some of these methods. Without a BPM
strategic vision, the culture of the organisation could not be changed.
Considering the factors negatively impacting BPMS adoption and use at BigFin
allowed us to propose a tentative explanatory model which the organisation could follow
to improve BPMS adoption and use. This more generalisable model is presented as a
framework in Fig. 2. While the Rosemann and vom Brocke core BPM model lists the
components needed for BPM success, this model provides an explanatory contribution
to understanding improved BPMS success. The model notes that a clear BPM vision is
needed and it needs to be aligned with the organisation’s legacy application strategy. This
is needed to enable better budget and resource alignment for BPM; a BPM culture [19];
appropriate BPM governance structures and BPM aligned legacy and standardization
decisions. These four factors, in turn, will influence an improved BPMS work system.
5 Conclusion
While organisations adopt BPMS and SOA technology primarily to achieve BP agility,
in many cases this agility is not achieved. There has been a call by BP researchers for
studies highlighting issues organisations are facing. Hence this research took a systems
approach to looking at the adoption and use of a BPMS in a South African financial
services organisation that was struggling to achieve BP agility. The systems approach
to technology adoption sees technology adoption as a work system change. Hence a
BPMS is seen as merely a technology in the BPMS work system trying to achieve agile
business processes for the organisation’s customers. Researchers Rosemann and vom
Brocke have identified the six core elements of BPM and their framework was used as
a lens in classifying the factors found impacting successful BPMS adoption and use in
the organisation studied.
This case study described the frustrations the IT team was facing using the BPMS
solution. While standalone applications can be implemented quickly, as soon as integra-
tion with other applications increases, the level of agility within those processes was found
to decrease. The inherent integrative nature of the BPMS solution and rigid nature of
integration with legacy applications made changing applications very time consuming.
Having insufficient resources and a user base that did not want to standardise or change
processes increased their frustrations. A work system’s view of the factors impacting
this frustration is shown in Fig. 2 and these also pointed to a generic explanatory model.
Implementing a BPMS without considering the strategic priorities and alignment as
well as governance and culture will result in frustration and a lack of agility. Hence this framework is useful to practitioners considering implementing a BPMS.
While this framework offers an explanatory model of how some factors negatively
impact successful BPMS adoption and use, it does have limitations. Firstly, the model
is not complete; interviewees were working in the IT function, and a richer picture
and more factors could have been obtained if employees from business functions were
interviewed. Secondly, the study was cross-sectional, and a longitudinal study looking
at the stages the organisation went through would be much richer. Thirdly, in terms of
context generalisation, the case organisation was in financial services and therefore was
risk averse and had a large legacy estate. Hence the framework has a focus on considering
legacy applications and their strategy and a younger or more risk tolerant organisation
with more modern applications will experience less frustration and hence parts of the
model might not be relevant. Also, this model might not include factors important to an
organisation which has other complexities or contextual factors.
Therefore, we note that this model is incomplete and contextual and further studies,
particularly longitudinal studies, with other organisations and a broader user base will
be able to further refine it. The framework could also be tested quantitatively to confirm
the importance of the relevant factors.
References
1. Chen, Y., Wang, Y., Nevo, S., Jin, J.F., Wang, L.N., Chow, W.S.: IT capability and organi-
zational performance: the roles of business process agility and environmental factors. Eur. J.
Inf. Syst. 23, 326–342 (2014). https://doi.org/10.1057/ejis.2013.4
2. Kryvinska, N.: Building consistent formal specification for the service enterprise agility
foundation. J. Serv. Sci. Res. 4, 235–269 (2012). https://doi.org/10.1007/s12927-012-0010-5
3. van Oosterhout, M., Waarts, E., van Hillegersberg, J.: Change factors requiring agility and
implications for IT. Eur. J. Inf. Syst. 15, 132–145 (2017). https://doi.org/10.1057/palgrave.ejis.3000601
4. Abramowicz, W., Filipowska, A., Kaczmarek, M., Kaczmarek, T.: Semantically Enhanced
Business Process Modeling Notation: Semantic Technologies for Business and Information
Systems Engineering, pp. 259–275. IGI Global, Hershey (2012)
5. Tallon, P.P.: Inside the adaptive enterprise: an information technology capabilities perspective
on business process agility. Inf. Technol. Manage. 9(1), 21–36 (2008)
6. Hill, J.B., Cantara, M., Kerremans, M., Plummer, B.C.: Magic Quadrant for Business Process
Management Suites. Gartner RAS Core Research Note G152906 (2007)
7. Harmon, P.: The scope and evolution of business process management. In: Brocke, J.,
Rosemann, M. (eds.) Handbook on Business Process Management 1, pp. 37–80. Springer,
Heidelberg (2015). https://doi.org/10.1007/978-3-642-00416-2_3
8. Heininger, R.: Requirements for business process management systems supporting business
process agility. In: Oppl, S., Fleischmann, A. (eds.) S-BPM ONE 2012. CCIS, vol. 284,
pp. 168–180. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-29294-1_12
9. Kerremans, M., Miers, D., Dunie, R., Wong, J., Iijima, K., Vincent, P.: Magic Quadrant for
Business Process Management Suites. Gartner G00345694 (2019)
10. Kidwell, D.S., Blackwell, D.W., Sias, R.W., Whidbee, D.A.: Financial Institutions, Markets,
and Money. Wiley, New Jersey (2016)
11. Nijssen, M., Paauwe, J.: HRM in turbulent times: how to achieve organizational agility? Int.
J. Hum. Resour. Manage. 23, 3315–3335 (2012). https://doi.org/10.1080/09585192.2012.689160
12. Di Francescomarino, C., Dijkman, R., Zdun, U. (eds.): BPM 2019. LNBIP, vol. 362, pp.
VII–X. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-37453-2
13. Trkman, P.: The critical success factors of business process management. Int. J. Inf. Manage.
30, 125–134 (2010). https://doi.org/10.1016/j.ijinfomgt.2009.07.003
14. Frambach, R.T., Schillewaert, N.: Organizational innovation adoption: a multi-level frame-
work of determinants and opportunities for future research. J. Bus. Res. 55, 163–176 (2002).
https://doi.org/10.1016/s0148-2963(00)00152-1
15. Davison, R.M.W., Louie, H.M., Alter, S., Ou, C.: Adopted globally but unusable locally: what
workarounds reveal about adoption, resistance, compliance and non-compliance. Presented
at the ECIS (2019). https://aisel.aisnet.org/ecis2019_rp/19/
16. Alter, S.: A systems theory of IT innovation, adoption, and adaptation. In: ECIS (2018).
https://aisel.aisnet.org/ecis2018_rp/26
17. Niehaves, B., Poeppelbuss, J., Plattfaut, R., Becker, J.: BPM capability development – a matter
of contingencies. Bus. Process Manag. J. 20, 90–106 (2014). https://doi.org/10.1108/bpmj-07-2012-0068
18. Thompson, G., Seymour, L.F., O’Donovan, B.: Towards a BPM success model: an analysis in
South African financial services organisations. In: Halpin, T., et al. (eds.) BPMDS/EMMSAD
2009. LNBIP, vol. 29, pp. 1–13. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01862-6_1
19. Rosemann, M., vom Brocke, J.: The six core elements of business process management. In:
vom Brocke, J., Rosemann, M. (eds.) Handbook on Business Process Management 1. IHIS,
pp. 105–122. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-642-45100-3_5
20. Lyytinen, K., Newman, M.: Explaining information systems change: a punctuated socio-
technical change model. Eur. J. Inf. Syst. 17, 589–613 (2017). https://doi.org/10.1057/ejis.2008.50
21. Imanipour, N., Talebi, K., Rezazadeh, S.: Obstacles in business process management (BPM)
implementation and adoption in SMEs. SSRN Electron. J. (2012). https://doi.org/10.2139/ssrn.1990609
22. vom Brocke, J., Petry, M., Schmiedel, T., Sonnenberg, C.: How organizational culture facili-
tates a global BPM project: the case of Hilti. In: vom Brocke, J., Rosemann, M. (eds.) Hand-
book on Business Process Management. IHIS, pp. 693–713. Springer, Heidelberg (2015).
https://doi.org/10.1007/978-3-642-45103-4_29
23. Kirchmer, M.: People enablement for process execution. In: High Performance Through
Business Process Management, pp. 67–80. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-51259-4_4
24. Schmiedel, T., vom Brocke, J., Recker, J.: Culture in business process management: how
cultural values determine BPM success. In: vom Brocke, J., Rosemann, M. (eds.) Handbook
on Business Process Management 2. IHIS, pp. 649–663. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-642-45103-4_27
25. Bandara, W., Alibabaei, A., Aghdasi, M.: Means of achieving business process management
success factors. In: Proceedings of the 4th Mediterranean Conference on Information Systems.
Athens University of Economics and Business, Athens (2009)
26. Von Rosing, M., von Scheel, J., Gill, A.Q.: Applying Agile Principles to BPM. The Complete
Business Process Handbook. Morgan Kaufmann, Waltham (2015)
27. Marshall, M.N.: Sampling for qualitative research. Fam. Pract. 13, 522–526 (1996). https://doi.org/10.1093/fampra/13.6.522
28. Attride-Stirling, J.: Thematic networks: an analytic tool for qualitative research. Qual. Res.
1, 385–405 (2016). https://doi.org/10.1177/146879410100100307
29. Alter, S.: Work system theory: overview of core concepts, extensions, and challenges for the
future. J. Assoc. Inf. Syst. 14, 72–121 (2013). https://doi.org/10.17705/1jais.00323
30. Rosemann, M.: The service portfolio of a BPM center of excellence. In: vom Brocke, J.,
Rosemann, M. (eds.) Handbook on Business Process Management 2, pp. 381–398. Springer,
Heidelberg (2010). https://doi.org/10.1007/978-3-642-01982-1_13
31. Awa, H.O., Ojiabo, O.U., Emecheta, B.C.: Integrating TAM, TPB and TOE frameworks and
expanding their characteristic constructs for e-commerce adoption by SMEs. J. Sci. Technol.
Policy 6, 76–94 (2015). https://doi.org/10.1108/jstpm-04-2014-0012
32. Santana, A.F.L., Alves, C.F., Santos, H.R.M., de Lima Cavalcanti Felix, A.: BPM governance:
an exploratory study in public organizations. In: Halpin, T., et al. (eds.) BPMDS/EMMSAD
-2011. LNBIP, vol. 81, pp. 46–60. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21759-3_4
33. Taudes, A., Feurstein, M., Mild, A.: Options analysis of software platform decisions: a case
study. MIS Q. 24, 227–243 (2000). https://doi.org/10.2307/3250937
34. Tumbas, S., Schmiedel, T.: Developing an organizational culture supportive of business
process management. In: Wirtschaftsinformatik, p. 115 (2013)
35. Tanriverdi, H., Konana, P., Ge, L.: The choice of sourcing mechanisms for business processes.
Inf. Syst. Res. 18, 280–299 (2007). https://doi.org/10.1287/isre.1070.0129
36. Andre, M., Baldoquin, M.G., Acuna, S.T.: Formal model for assigning human resources to
teams in software projects. Inform. Softw. Tech. 53, 259–275 (2011). https://doi.org/10.1016/j.infsof.2010.11.011
37. Engwall, M., Jerbrant, A.: The resource allocation syndrome: the prime challenge of multi-
project management? Int. J. Project Manage. 21, 403–409 (2003). https://doi.org/10.1016/s0263-7863(02)00113-8
38. Westland, J.: The Project Management Life Cycle: A Complete Step-by-step Methodology
for Initiating Planning Executing and Closing the Project. Kogan Page Publishers, London
(2007)
Chatting About Processes in Digital
Factories: A Model-Based Approach
1 Introduction
Given the dynamic nature of today’s business world, organizations should quickly
respond to a wide range of possible changes in their environment. The changes
can have either short-term effects, like disruptions in the supply or production
chain, or long-term effects, like changes in one part of the production system
that affect other subsystems and processes. Changes may also have an impact
on controlling and monitoring essential parameters that refer to the situation in
which a production process is being executed. The introduction of data-driven
solutions and enabling technologies, to transform collected data into actionable
insights, has raised the complexity of monitoring and controlling activities. In
addition, different roles, including business managers, data analysts, machine
operators and designers, need to be informed about the processes and their changes
to perform their job. These workers: (a) need tools to find the right information
that are embedded inside a chatbot. These engines interact with each other to
deliver an engaging conversation with the user via control tables. The engines are
the Conversation Engine, the Interpretation Engine, and the Transition Engine,
and all of them are table-driven, following a data-driven approach [3].
The Conversation Engine allows a user “to speak to” (“to be spoken by”)
an application, taking into account the user’s profile to tailor the conversation.
The role of the Transition Engine is to control the progress on a pathway, i.e., a
sequence of content items for teaching. When a request is received, the Transition
Engine takes care of the transitions across content items, i.e., determining which
one is the most suitable “next item”. It is driven by a database (storing the
content items) and tables describing the “pathways” (defined by the author
of the contents) that allow traversing the content. A conversational interface
interacts with the user through different devices (e.g., mobile phone, tablet,
Alexa device) through a variety of media (e.g., text and videos).
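As an illustration of the table-driven approach, the following Python sketch walks a pathway table and serves one content item per turn; the table and the items are invented for illustration, not taken from the paper.

# The pathway table maps each content item to its successor (None ends
# the pathway); the Item DB holds the content to be rendered per turn.
pathway = {"intro": "overview", "overview": "machine-setup", "machine-setup": None}
items_db = {"intro": "Welcome to the production line.",
            "overview": "The line consists of three stations.",
            "machine-setup": "Station 1 fills the containers."}

item = "intro"
while item is not None:
    print(items_db[item])   # the Conversation Engine would render this turn
    item = pathway[item]    # table lookup decides the most suitable next item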
The Interpretation Engine is responsible for maintaining a number of parameters
describing the dynamic situation of the user (from cognitive, emotional, psycho-
logical points of view) and, more importantly, to interpret them properly. Using
the parameters, in fact, the engine should “interpret” the situation and instruct
the Transition Engine about what to do at the next step. The Interpretation
Engine is controlled via a set of rules stored in “interpretation tables”.
In addition, the Interface Component is responsible for technical jobs, like
managing the various input/output devices, managing possible media conver-
sions (e.g., text-to-speech, speech-to-text), transferring user turns to the conver-
sation engine, and transferring “content items” and chatbot turns to the user.
the user controller to manage specific domain conversation; in case “iii” it passes
control and parameters to the interpretation engine for any further actions on
both state machine and learning path.
Starting from the reference state diagram in iCHAT [3], we built a new state diagram in Fig. 2 to consider more situations in the conversation between a human and a chatbot. This graph consists of a set of states and transitions between them. The transition from one state to the next depends on the user’s utterance. The state diagram allows controlling the utterances that the chatbot produces when reaching each transition or state.
This finite state machine is defined as a tuple (Q, Σ, τ, q0, F) consisting of a finite set of states Q, a finite set of input messages Σ, a transition function τ: Q × Σ → Q, an initial state q0 ∈ Q, and a set of final states F ⊆ Q. The transition function here is a set of predefined data-driven rules in the conversation table (Fig. 3a)¹ that activate a message for the chatbot.
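As an illustration of how such a table-driven transition function could be realized, the following minimal Go sketch encodes a fragment of a conversation table as data; the state names, messages, and utterances are invented for illustration and do not reproduce the actual tables of the system.

```go
package main

import "fmt"

// State and message identifiers as used in the conversation table.
type State string
type Message string

// Entry of the conversation table: for a (state, message) pair it gives
// the next state and the utterance the chatbot produces.
type Entry struct {
	Next      State
	Utterance string
}

// The transition function τ: Q × Σ → Q, stored as a lookup table so that
// the dialogue can be changed by editing data rather than code.
var conversationTable = map[State]map[Message]Entry{
	"S00": {
		"hello": {Next: "S01", Utterance: "Welcome! Shall we start the pathway?"},
	},
	"S01": {
		"next":  {Next: "S01", Utterance: "Here is the next content item."},
		"pause": {Next: "S_PAUSED", Utterance: "Ok, taking a short break."},
		"stop":  {Next: "S_STOPPED", Utterance: "Goodbye!"},
	},
}

// Step applies one user message; unknown inputs leave the state unchanged.
func Step(q State, m Message) (State, string) {
	if e, ok := conversationTable[q][m]; ok {
		return e.Next, e.Utterance
	}
	return q, "Sorry, I did not understand that."
}

func main() {
	q, out := Step("S00", "hello")
	fmt.Println(q, out) // S01 Welcome! Shall we start the pathway?
}
```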
A conversation goes on as a sequence of “turns” by the chatbot and the user.
According to the state machine in Fig. 2, a conversation can be paused (for a
short break and later resumed), suspended (for a long break and later resumed),
stopped (leaving the conversation), or completed (at the end of the pathway).
¹ ‘*’ in a message ID and ‘?’ before the category name indicate enquiring from the user. For readability, states and related transitions are separated by lines.
The number of consecutive turns and other chatbot behaviours² are controlled by tables.
The chatbot's dialogue is designed around a set of categories. Each state in Fig. 2 is associated with different sequences of categories, and dialogues are associated with each state and transition; the dialogues are generated by considering the categories appropriate to the situation. The sentence(s) of each turn belong to one out of seven categories: greetings, support, preview (of content), summary (of what has been done), forecast (of what is needed to complete the pathway), action (asking the user to take action), and reinforcement messages.
For example, in state “S00” the chatbot performs a dialogue from the “greetings” category; in transition “T3.1” the chatbot first says a “summary” message about what the user has done and then a “forecast” message about what the user has to learn next. All information for generating the chatbot dialogues is derived from the control tables.
² The behaviour can be customized on the basis of the user's profile or context information. The chatbot, for example, could take several turns or only a few; it could take long, verbose turns or speak very briskly; it could use different wording styles (e.g., “professional”, “friendly”, “soft”).
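A minimal sketch of how a chatbot turn could be assembled from the category sequences stored in the control tables; the category-to-sentence mapping and the associations for “S00” and “T3.1” are illustrative stand-ins for the actual tables.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical control tables: each state or transition is associated
// with an ordered list of categories, and each category with a sentence
// drawn from the content tables.
var categoriesFor = map[string][]string{
	"S00":  {"greetings"},
	"T3.1": {"summary", "forecast"},
}

var sentenceFor = map[string]string{
	"greetings": "Hi! I am the factory assistant.",
	"summary":   "So far you have completed the overview items.",
	"forecast":  "Next, you will learn how the baking step works.",
}

// ComposeTurn builds a chatbot turn by concatenating one sentence per
// category associated with the current state or transition.
func ComposeTurn(stateOrTransition string) string {
	var parts []string
	for _, cat := range categoriesFor[stateOrTransition] {
		parts = append(parts, sentenceFor[cat])
	}
	return strings.Join(parts, " ")
}

func main() {
	fmt.Println(ComposeTurn("T3.1"))
}
```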
Fig. 3. Data-driven simulator for generating dialogues from tables for the chatbot
for optional ones, and purple for additional materials). These content items are stored in the “Item DB” table and organized into learning pathways. In addition, users can select the pathway topology most suitable for a given context, which is called “adaptivity in the large”.
– Quick overview. The learner can go through the overview items of the process. The goal is to let an employee quickly grasp the outline of the production process. To this end, only items with the proper metadata and tags are used to create a conversation at the highest abstraction level of the process. The learning topology of this pathway can be expressed as a straight pathway, as shown in Fig. 4a. A straight pathway contains a list of items whose nodes have the “Overview” flavor type and the “Core” role, shown in blue in the graph. The user can step through the items to complete the learning path; jumping back through items is not possible in this pathway. The Transition Engine keeps a list of items for each straight pathway.
– Full learning. The full learning pathway shows more details about the process by considering all optional nodes for each activity. Figure 4b shows a straight pathway with optional items; users have control to select or skip the optional items. All blue items in the pathway have the “Core” role and are mandatory for the user to follow. Optional items are depicted with green node color; the Transition Engine informs the chatbot if the next item is optional, and the chatbot gives the user the choice to take the optional item or not during her learning path. If the user takes the item, the Transition Engine marks it “visited”, otherwise “skipped”.
– Advanced learning. Advanced learning pathways declare a straight pathway with core, optional, and additional-material items, which include all information (e.g., process, sub-processes, and variables related to each process). An example of this topology is shown in Fig. 4c. Additional materials in the Item DB are tagged “Recommended” and are purple in the graph. If a user wants to go deep into one item, this pathway structure is applicable. From the learning path viewpoint, these recommended materials provide an advanced learning experience. In a process, these additional materials can be inserted by the author who designs the pathway, to add information about the gateways, flow conditions, task data, and so on.
The Transition Engine derives core items from the “Item DB”. From the “Full Learning” state, going to core or optional items is possible, based on the underlying structure of the pathway. From the “Advanced Learning” state, users can also go further into a component of the process to retrieve recommended information about the sub-processes and variables available in that component.
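The next-item logic of the Transition Engine over a straight pathway with Core/Optional/Recommended roles could look roughly as follows; this is a sketch under the assumption, described above, that non-core items are offered to the user and then marked “visited” or “skipped”.

```go
package main

import "fmt"

// Role of a content item in a pathway, as in the MyMuffin use case.
type Role string

const (
	Core        Role = "Core"
	Optional    Role = "Optional"
	Recommended Role = "Recommended"
)

type Item struct {
	ID     string
	Role   Role
	Status string // "", "visited" or "skipped"
}

// NextItem returns the next unvisited item of a straight pathway. For
// non-core items the caller (the chatbot) first asks the user via take;
// take == false marks the item as skipped and moves on.
func NextItem(pathway []Item, take func(Item) bool) *Item {
	for i := range pathway {
		it := &pathway[i]
		if it.Status != "" {
			continue
		}
		if it.Role == Core || take(*it) {
			it.Status = "visited"
			return it
		}
		it.Status = "skipped"
	}
	return nil // pathway completed
}

func main() {
	pw := []Item{{ID: "overview-1", Role: Core}, {ID: "detail-1", Role: Optional}}
	alwaysSkip := func(Item) bool { return false }
	for it := NextItem(pw, alwaysSkip); it != nil; it = NextItem(pw, alwaysSkip) {
		fmt.Println("presenting", it.ID)
	}
}
```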
muffins with yogurt, icing sugar on top, and pink baking paper, and another client orders 2 boxes of carrot muffins with yogurt, nothing on top, and yellow baking paper, the same dough can be used for both orders. Clearly, this scheduling service depends on the number (and capacity) of dough mixers, the stream of received orders, etc. The factory has a pool of dough mixers of different capacities. The fact that the number of different combinations is finite guarantees that such a scheduling can be performed. When an order is received, in parallel to the dough preparation, the baking paper has to be set up as well. In addition to preparing a set of the requested paper baking cases, a QR code has to be printed on each of them and used as a unique identifier of the specific order. The identification of the single muffin is crucial for customization. After the dough has been prepared, the muffins are placed in the baking paper cases and sent to the oven (connected to a QR code reader) for cooking. Muffins are cooked in batches of about 1,000 items, and the length of this step is equal for all of them. After the baking has been performed, the cart is operated in order to route the different muffins to the right boxes, after putting on the right topping, and then to the proper delivery station. Depending on the order, different delivery agents can be used.
The objective of introducing the chatbot here is to create a conversation that teaches the process. As soon as a new employee arrives in the MyMuffin factory, the chatbot takes care of the initial orientation briefing about the company's process components and all necessary information. Once the information regarding the initial training, operational knowledge, and other processes is handled by the chatbot, neither the employees nor the trainer have to depend on each other. In the MyMuffin use case, the semantic tag related to each learning item is defined by four flavors (“Overview”, “Definition”, “Subdivision”, and “FAQ”), which depend on the specific process domain, and three roles (“Core”, “Optional”, and “Recommended”), which are general for any learning content related to process navigation.
The conversation is driven by tables governing how the chatbot speaks and how it understands the user's utterances (Fig. 7). Below the tables, a simple example of a conversation between chatbot and user is shown. As shown in Fig. 7, for state “S13” of the state machine there are predefined categories for the chatbot turn when resuming a suspended conversation. Each sentence from the conversation table calls contents from the “Item DB” table to create the chatbot's dialogues, selecting items for the categories specified for “S13” to create a complete dialogue.
6 Related Work
In recent years, several attempts have been made to ease the creation of chatbots, aiming at querying data available in unstructured and structured formats.
Romero et al. [18] explored a set of application domains that can support humans in supervising cyber-physical systems in digital factories. According to the authors of [16], these domains can be supported by using chatbots. They introduced many use case scenarios where softbots can bring proactive insights to production planners to optimize and support all operational demands. The authors mention that in smart supply networks softbots provide monitoring (e.g., track and trace) to empower companies against any disruption in the supply network. In [7], a chatbot was developed to aid new hires through their onboarding process and meet related information needs as a human assistant would.
In [15], a chatbot was constructed on top of open data. Here, the first step is to extract plain text from documents stored as PDF files by employing optical character recognition (OCR) software. Then, a set of possible questions about the extracted contents is constructed using an “Overgenerating Transformations and Ranking” algorithm, implemented using the question generation framework presented in [9]. Finally, the matching patterns, essential to the chatbot's answering capability, are defined through the Artificial Intelligence Markup Language (AIML).
OntBot [1] employs a mapping technique to transform an ontology into a relational database and then uses that knowledge to construct answers. The main drawback of traditional chatbots, implemented for example through the AIML language, is that the knowledge base has to be constructed ad hoc by handwriting thousands of possible responses. Instead of providing answers by looking for a matching one inside a database, OntBot retrieves information from the database, which is then used to build up the response. Therefore, like our solution, OntBot does not require handwriting the entire knowledge base that stands behind the system.
In [10], the authors argue that chatbots are a proper solution to provide personalized learning in MOOCs (Massive Open Online Courses). iMOOC [5] is a novel methodology for designing customizable MOOCs that brings adaptivity into the learning experience. iMOOC acknowledges that various users of MOOCs may have different purposes; some learners may want personalized content rather than learning the entire material. Here, a chatbot is introduced to support the learner in choosing the most appropriate path. In the TEL (Technology-Enhanced Learning) literature, Intelligent Pedagogical Agents (IPAs) have for several years been proposed as embodied intelligent agents designed for pedagogical purposes to support learning. Chatbots such as those proposed in this paper are basically a form of IPA, and here we demonstrate their applicability to learning manufacturing processes.
The dialogue management component of a chatbot is one of the key parts in designing a conversational interface. In [14], dialogue management design in industry is discussed. A dialogue manager for a goal-oriented chatbot to conduct a proper conversation with the user is illustrated in [11]. In [17], a model based on a finite-state turn-taking machine is introduced in order to select an action at any time.
Our aim is to propose a framework to query and teach processes. The interest in employing chatbots in the context of Business Process Management (BPM) is quite recent. The authors of [12] propose a way to take a business process flow as input and produce a Watson conversation model as output. Differently from our approach, they focus on the execution of the process instead of teaching or monitoring it. In [4], an approach is proposed that uses chatbots to learn business processes from input data. There, the focus is not on process participants; instead, the final users are data analysts in charge of mining business processes. In [13], the use of chatbots to help a process actor perform the tasks of a process model via conversation is presented.
Acknowledgements. The work of Donya Rooein has been supported by EIT Digital
and IBM. The work of Francesco Leotta and Massimo Mecella has been supported by
the EU H2020-RISE project FIRST. The work of Devis Bianchini has been supported
by Smart4CPPS Lombardy Region project. This work expresses the opinions of the
authors and not necessarily those of the funding agencies and companies.
References
1. Al-Zubaide, H., Issa, A.A.: OntBot: ontology based chatbot. In: International Sym-
posium on Innovations in Information and Communications Technology, pp. 7–12.
IEEE (2011)
2. Bicocchi, N., Cabri, G., Mandreoli, F., Mecella, M.: Dynamic digital factories for
agile supply chains: an architectural approach. J. Ind. Inf. Integr. 15, 111–121
(2019)
3. Di Blas, N., Lodi, L., Paolini, P., Pernici, B., Renzi, F., Rooein, D.: Data driven chatbots: a new approach to conversational applications. In: SEBD (2019)
4. Burattin, A.: Integrated, ubiquitous and collaborative process mining with chat
bots. In: 17th International Conference on Business Process Management, BPM
Demo Track, Vienna, Austria, pp. 144–148. CEUR-WS (2019)
5. Casola, S., Di Blas, N., Paolini, P., Pelagatti, G.: Designing and delivering MOOCs that fit all sizes. In: Society for Information Technology & Teacher Education International Conference, pp. 110–117. AACE (2018)
6. Catarci, T., Firmani, D., Leotta, F., Mandreoli, F., Mecella, M., Sapio, F.: A
conceptual architecture and model for smart manufacturing relying on service-
based digital twins. In: IEEE International Conference on Web Services, ICWS,
Milan, Italy, pp. 229–236 (2019)
7. Chandar, P., et al.: Leveraging conversational systems to assists new hires during
onboarding. In: Bernhaupt, R., Dalvi, G., Joshi, A., Balkrishan, D.K., O’Neill, J.,
Winckler, M. (eds.) INTERACT 2017. LNCS, vol. 10514, pp. 381–391. Springer,
Cham (2017). https://doi.org/10.1007/978-3-319-67684-5 23
8. Doan, A.: Human-in-the-loop data analysis: a personal perspective. In: ACM Inter-
national Workshop on Human-In-the-Loop Data Analysis (HILDA 2018), pp. 1–6
(2018)
9. Heilman, M., Smith, N.A.: Question generation via overgenerating transforma-
tions and ranking. Technical report, Carnegie Mellon University Pittsburgh PA
Language Technologies Institute (2009)
10. Holotescu, C.: MOOCbuddy: a chatbot for personalized learning with MOOCs. In: Iftene, A., Vanderdonckt, J. (eds.) RoCHI - International Conference on Human-Computer Interaction, pp. 91–94, Bucharest (2016)
11. Ilievski, V., Musat, C., Hossmann, A., Baeriswyl, M.: Goal-oriented chat-
bot dialog management bootstrapping with transfer learning. arXiv preprint
arXiv:1802.00500 (2018)
12. Kalia, A.K., Telang, P.R., Xiao, J., Vukovic, M.: Quark: a methodology to trans-
form people-driven processes to chatbot services. In: Maximilien, M., Vallecillo, A.,
Wang, J., Oriol, M. (eds.) ICSOC 2017. LNCS, vol. 10601, pp. 53–61. Springer,
Cham (2017). https://doi.org/10.1007/978-3-319-69035-3 4
13. López, A., Sànchez-Ferreres, J., Carmona, J., Padró, L.: From process models to
chatbots. In: Giorgini, P., Weber, B. (eds.) CAiSE 2019. LNCS, vol. 11483, pp.
383–398. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-21290-2 24
14. Paek, T., Pieraccini, R.: Automating spoken dialogue management design using
machine learning: an industry perspective. Speech Commun. 50(8–9), 716–729
(2008)
15. Pichponreay, L., Kim, J.-H., Choi, C.-H., Lee, K.-H., Cho, W.-S.: Smart answering
chatbot based on OCR and overgenerating transformations and ranking. In: 8th
International Conference on Ubiquitous and Future Networks (ICUFN), pp. 1002–
1005. IEEE (2016)
16. Rabelo, R.J., Romero, D., Zambiasi, S.P.: Softbots supporting the operator 4.0 at
smart factory environments. In: Moon, I., Lee, G.M., Park, J., Kiritsis, D., von
Cieminski, G. (eds.) APMS 2018. IAICT, vol. 536, pp. 456–464. Springer, Cham
(2018). https://doi.org/10.1007/978-3-319-99707-0 57
17. Raux, A., Eskenazi, M.: A finite-state turn-taking model for spoken dialog systems.
In: Proceedings of Human Language Technologies: The 2009 Annual Conference
of the North American Chapter of the Association for Computational Linguistics,
pp. 629–637 (2009)
18. Romero, D., et al.: Towards an operator 4.0 typology: a human-centric perspective
on the fourth industrial revolution technologies. In: International Conference on
Computers and Industrial Engineering (CIE46) (2016)
Enforcing a Cross-Organizational
Workflow: An Experience Report
1 Introduction
Globalization, novel technologies, fast-changing environments, the need for cost reduction, and the rapid evolution of information and communication technologies introduce challenges for enterprises, which are tackled by cross-organizational collaboration between organizations, as stated in [9]. Establishing such cross-organizational collaboration requires the involved enterprises to specify the different interoperations and a common goal beforehand. This can be done by modelling these cooperative processes.
Already in 1998, before the emergence of mature business process modelling standards like BPMN, van der Aalst suggested how inter-organizational workflows
could be modelled and verified. In [18], he describes the concept of loosely cou-
pled workflow processes, which use asynchronous communication. A workflow
consists of one or more processes, each one assigned to a given organization. The
processes operate independently from each other but synchronize at specified
points to ensure the correct execution of the whole workflow. However, in the
same paper van der Aalst did not provide any details on how such an approach
can be implemented in practice.
Particular industrial plants, designed and built under the global supervision of a company called the EPC (abbreviation for Engineering, Procurement and Construction), require certification by a governmental entity. In this report we take a detailed look at the certification process for a power plant, which is also valid for other types of industrial construction. The certification process requires the governmental body to issue approvals throughout the different stages of the power plant's functional safety life cycle (consisting of design, fabrication, mounting, commissioning, and test approval). In Fig. 1 we illustrate the high-level phases of the life cycle.
In each step of the process, multiple documents are sent to the governmental body, which checks the documents, asks for revisions if needed, and eventually
Fig. 1. High-level phases of the industrial plant functional safety life cycle
issues a certificate with its statement. This process includes different parties with different goals. Therefore, various steps of the process should be secured, to assure all parties that the others execute the workflow correctly and that, therefore, certain invariants (safety properties in the sense of temporal logic) hold. In case one of those invariants does not hold, all parties should be able to ascertain the responsible entity. Digitalizing this certification process results in an acceleration of the overall certification process, a secure management of all versions of the published documents, an easy search through the documents, a clear overview of all requirements and tasks during the entire plant life cycle, and, finally, a reduction of paperwork.
The following stakeholders are involved in the certification process. The Owner commissions the plant, checks that the participating parties proceed correctly, and gives its approval throughout the entire life cycle. The EPC¹ represents the service provider, which designs and builds the industrial plant. In our case, the EPC was represented by Siemens AG. The NOBO, short for Notified Body, is a governmental agency responsible for examining and eventually approving the results of the different stages of the construction of the industrial plant throughout the overall plant life cycle. In our case, the participating NOBO was TÜV Rheinland. The Manufacturer produces parts of a component and delivers them to the EPC. The certification process described in this experience report is an industrial standard.
¹ https://www.marquard-bahls.com/de/news-info/glossar/detail/term/epc-engineering-procurement-and-construction.html
3 Background
3.1 Modelling Inter-organizational Workflows
² http://yawlfoundation.org/
³ http://yoann.nogues.free.fr/IMG/pdf/07-04 WP Intro to BPMN - White-2.pdf
In order to digitalize the certification process for the construction of power plants, the following steps are performed (Fig. 2). First, the use case is modelled using a workflow modelling language. In the second step, the derived models are translated into one coherent model to verify its correctness. Third, the model is translated into Smart Contracts in Hyperledger Fabric, and in the fourth and last step, the Smart Contract functionality is validated with regard to the model that was verified in the second step.
The first step in the digitalization of the certification process was modelling the workflow. Four workshops were held, each with three representatives from the involved organizations: two domain experts from the company in charge of the design and construction of the power plant, and one expert from a governmental authority. We, as researchers and facilitators, analysed several aspects of the process together with the domain experts: the different stages of the safety life cycle, the involved parties, the documents to be certified by the governmental entity, and the overall workflow. Since this initial modelling phase required the participation of domain experts, who are not IT experts, ordinary P/T-Nets were used to make sure the models are easy to understand and adapt.
Since we are dealing with an inter-organizational business process, van der Aalst's method was used for modelling and verifying this type of workflow: the concept of loosely coupled processes, which use asynchronous communication. This method is described in [18] and explained in more detail in [5]. Using this concept, shared places were introduced to the P/T-Nets. This can be seen as an assumption-commitment specification of a contract, which defines the interfaces between the participating parties. In the later implementation, only those interfaces are known to everyone; internal business logic stays secret. Therefore, the confidentiality of internal states and processes of all participants is guaranteed, while there is nevertheless an understanding that a certain activity has been done, but not how it has been done. Additionally, special operators (i.e., XOR-joins/splits) were added to enable decision-making in the Net.
Since the tasks of the certification process of a power plant are bound to documents, only the publication of the so-called documentation packages was modelled. The tool WoPeD was used to model the different stages of the certification process in ordinary P/T-Nets with special operators and the shared-places extension. After two workshops with the domain experts, two meta-models emerged. Those meta-models were applied to all models of the different stages of the life cycle. This resulted in 7 models (Plant Design, System Design, Component Design, Fabrication, Mounting, Commissioning, Test Approval), 200 places, and 83 transitions. Since a plant consists of multiple systems, which consist of multiple components, and each component is composed of several parts, an overall model was introduced to represent the parallel execution of designing the systems and components, manufacturing the different parts of a component, mounting the components, running each system, and finally combining the systems and running the whole plant.
The first state space, calculated from the finished CPN, was infinite; therefore, only a partial state space was presented in the first report (Fig. 4). The state space was calculated for ten minutes, which resulted in a state space with almost 300,000 nodes. There were four unbounded places (Fig. 3) and over 500 dead markings.
Fig. 3. Unbounded places discovered in the first report: CAT'Initial_state, EPC'EPC_wait, NOBO'Wait_NOBO, and NOBO'Wait_for_documentation (bar chart of token counts omitted)
Fixing the discovered unbounded places resulted in a finite state space with 186 nodes, 225 arcs, and two dead markings (final report in Fig. 4). Analyzing the generated state space for the previously defined properties yielded the following: the CPN model terminates, is deadlock-free, holds no unbounded places, and is therefore bounded. The Net is not sound, since the state space contains two dead markings. Since both markings are valid end states, distinguished only by the placement of tokens in the final places, no further adjustments were made to the model.
Fig. 4. State space statistics of the first report (partial state space, on the order of 269,866 and 229,798 elements, with 5,394 dead markings) compared with the final report (186 nodes, 225 arcs, 2 dead markings); bar chart omitted
break their agreements and can therefore serve as a machine-enforceable and verifiable protocol for inter-organizational workflows. Implementing the business process as Smart Contracts on a Blockchain forces all involved parties to execute the workflow correctly and therefore establishes trust between all workflow participants without requiring a trusted third party. The following requirements were considered in the choice of a suitable Blockchain platform. All participants of the network must be identifiable at any time. Unauthorized access by non-involved organizations must be prevented. Privacy and confidentiality of transactions and data must be guaranteed at all times. The private permissioned Blockchain platform Hyperledger Fabric supports all of the aforementioned requirements and is therefore used to implement the use case.
The CPN model was manually translated into Chaincode using Golang and the IBM Blockchain Platform⁴ extension in Visual Studio Code. Interfaces involving the publication of documents were extracted from the CPN model, and the surrounding transitions were examined. Internal places and transitions framing the interfaces were merged into one transition. Figure 5 displays the original part of the CPN model covering the publication of the Plant Design documentation package by the EPC.
The internal places and transitions (creating, processing, and publishing the documentation package) were merged into one atomic transition (Fig. 6). This transition represents only the publication of the documentation package itself. This ensures that the internal processes and states of each participant in the workflow, and later in the Blockchain, are kept secret. The transition is surrounded by incoming/outgoing places, constituting the previously extracted interfaces. The incoming places are named preconditions, whereas the outgoing
⁴ https://marketplace.visualstudio.com/items?itemName=IBMBlockchain.ibm-blockchain-platform
Fig. 5. Part of the CPN model covering the publication of the plant design
places are postconditions, see Fig. 6. All pre- and postconditions are represented as interfaces in the CPN model. In the end, all interfaces constituting published documents were represented in the Smart Contracts, which determine a sequence of tasks.
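A rough, Fabric-independent sketch of such a merged atomic transition: publishing a documentation package succeeds only if all of its precondition interfaces are already marked on the ledger. The Ledger type and the precondition table are simplifications for illustration; the actual Chaincode uses the Hyperledger Fabric APIs, which are not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// Ledger records which documentation packages have been published.
type Ledger map[string]bool

// preconditions encodes the interfaces extracted from the CPN model
// (here only the FeasibilityStudy/PlantDesign dependency used in the
// test cases below).
var preconditions = map[string][]string{
	"PlantDesign": {"FeasibilityStudy"},
}

// Publish implements one atomic transition: it fails unless every
// precondition place is marked, and otherwise marks the postcondition.
func Publish(l Ledger, doc string) error {
	for _, pre := range preconditions[doc] {
		if !l[pre] {
			return fmt.Errorf("publish(%s): precondition %s not satisfied", doc, pre)
		}
	}
	if l[doc] {
		return errors.New("document already published")
	}
	l[doc] = true
	return nil
}

func main() {
	l := Ledger{}
	fmt.Println(Publish(l, "PlantDesign")) // error: FeasibilityStudy missing
	l["FeasibilityStudy"] = true
	fmt.Println(Publish(l, "PlantDesign")) // <nil>
}
```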
To evaluate whether the Smart Contracts implement the workflow correctly, the token game was run on the model, which resulted in a set of executed tasks together with their initial and current local markings. These tuples were put into logical orders:
– Prerequisites:
  publish(FeasibilityStudy) → SUCCESS
– Positive test case:
  published(FeasibilityStudy), publish(PlantDesign) → SUCCESS
– Negative test case:
  ¬published(FeasibilityStudy), publish(PlantDesign) → ERROR
The resulting orders were manually translated into unit tests, which were
run against the Smart Contracts. The partial state of the Net should be equal
to the current state of the ledger.
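Assuming the Publish function and Ledger type from the sketch above live in the same package, the positive and negative logical orders translate into unit tests along these lines (illustrative, not the authors' actual test suite):

```go
package main

import "testing"

// Positive case: the precondition is satisfied, publication must succeed.
func TestPublishPlantDesignAfterFeasibilityStudy(t *testing.T) {
	l := Ledger{"FeasibilityStudy": true}
	if err := Publish(l, "PlantDesign"); err != nil {
		t.Fatalf("expected SUCCESS, got %v", err)
	}
}

// Negative case: the precondition is violated, publication must be rejected.
func TestPublishPlantDesignWithoutFeasibilityStudy(t *testing.T) {
	l := Ledger{}
	if err := Publish(l, "PlantDesign"); err == nil {
		t.Fatal("expected ERROR, publication must be rejected")
	}
}
```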
Formulating unit tests from the extracted logic resulted in 87 tests with a statement coverage of 44.4%. To also cover simple checks for wrongly inserted (numbers of) arguments, additional unit tests were formulated. A regression test was written to cover the overall execution. The written tests cover 56.6% of all statements. Certain error cases (e.g., JSON encoding/decoding failures, errors while getting/putting Document structures from/onto the ledger) were not considered in the unit tests, which explains the low test coverage.
(Figure, diagram omitted: method overview. Step 1, modelling process: domain experts, sub-contractors/governmental authorities, and a software engineer produce a Coloured Petri Net; the token game is run on it and results in the current marking. Step 2, Smart Contract and unit test implementation: the Net is translated into Smart Contracts; logic formulated from the token game is turned into unit tests, which are run on the Smart Contracts and result in the current state of the ledger, which should be equal to the current marking.)
Unfortunately, a model often cannot represent every detail of the workflow. Therefore, information that is not captured by the model has to be recorded in textual form, cannot be verified, and has to be additionally considered in the implementation and testing phases.
In case of a change in the model, the Smart Contracts are easily adaptable, but backwards compatibility is not guaranteed. Also, once the Smart Contracts are running, they cannot be updated, since in our case the keys for retrieving earlier transactions from the ledger are saved locally by each node in the Blockchain network and depend on the currently running Smart Contracts. This approach does not force all participants to use the same SDK, but gives each stakeholder the flexibility to use any technology they want behind the API.
6 Related Work
In [4], the authors demonstrate the use of Blockchain technology in the context of inter-organizational business processes for the use case of a documentary letter of credit. Unfortunately, no method is introduced to specify a workflow. Since we are dealing with a large workflow, a method to specify the workflow and translate this specification into Smart Contracts is necessary. Routis investigates the use of Case Management Modelling and Notation for collaborative processes [15]; a method for translating the defined processes into Smart Contracts is not given. The authors of [11] introduce the notion of Enforceable Business Processes (EBP), cooperative stable processes managed by multiple mutually independent organizations with a common goal. Challenges regarding running EBPs as Smart Contracts are analysed, but a method for mapping such processes onto Smart Contracts is not introduced.
Caterpillar [10] and Lorikeet [17] translate BPMN to Solidity. Caterpillar is an open-source Business Process Management System running on Ethereum. Models in BPMN are translated to Solidity with the compilation tools provided by Caterpillar; furthermore, the execution engine enables the deployment of the Smart Contracts generated by the compilation tools, as stated in [10]. Lorikeet is a model-driven engineering tool to implement business processes on a Blockchain, described in [17]. The tool creates Solidity Smart Contract code from BPMN specifications, which is demonstrated by an industrial use case. These tools restrict the modelling of the business process to BPMN. Since we use Petri Nets to model our use case, because of the advantages mentioned above, those systems are not applicable.
Nakamura presents in [1] an approach similar to ours: defining the cross-organizational business processes in statecharts, applying statechart reduction algorithms to optimize the size of the statechart, and generating software artifacts (Smart Contracts) running on a Blockchain. Since we are dealing with an inter-organizational business process that comprises obligations agreed upon by all parties beforehand, a reduction of the size of the specified workflow, and therefore a possible loss of information, is not permitted. Furthermore, statecharts lack formal semantics and therefore cannot be formally verified. The proposed solution in [1] does not include a verification of the statecharts.
7 Conclusion
Today, business processes often exceed organizational boundaries, and therefore trust between the organizations that participate in such a process has to be established. In the past, this trust was guaranteed solely through legal contracts. To digitally establish trust without requiring a trusted third party, Blockchain technology together with Smart Contracts is used to ensure that the involved organizations cannot break their agreements. Smart Contracts can serve as a machine-enforceable and verifiable protocol for inter-organizational workflows. This experience report introduced a solution on how to model, verify, and implement an enforcement of cross-organizational business processes. This was done on the basis of the use case “Certification of the Construction of Industrial Plants”. Using different types of Petri Nets and verifying the resulting models in CPN Tools assured the correct representation of the workflow. The verified models were then translated into Smart Contracts, which were evaluated based on the simulation and reachability graph of the models.
To enhance the outcome of this report, the following points could be considered in the future. Building a generator that generates Smart Contracts from Petri Nets, Coloured Petri Nets in particular, could reduce the cost of the method introduced in this report. Furthermore, versioning and updating of the Smart Contracts have to be improved, and the admissibility of the ledger and the published documents as evidence in a court of law should be researched.
Acknowledgement. This research work has been partially funded by the European Union's H2020 Programme under the Grant Agreement No. 830929 (CyberSec4Europe). We would also like to thank our colleagues from Siemens and TÜV who were involved.
References
1. B, S.M.G., Yakhchi, S., Beheshti, A.: Inter-organizational business processes managed by Blockchain, pp. 161–177 (2018). https://doi.org/10.1007/978-3-030-02922-7_11
2. Christidis, K., Devetsikiotis, M.: Blockchains and smart contracts for the Internet
of Things. IEEE Access 4, 2292–2303 (2016). https://doi.org/10.1109/ACCESS.
2016.2566339
3. Dijkman, R.M., Dumas, M., Ouyang, C.: Formal semantics and analysis of BPMN
process models using petri nets. Technical Report, Prepr. 7115, 1–30 (2007). http://
eprints.qut.edu.au/7115/
4. Fridgen, G., Radszuwill, S., Urbach, N., Utz, L.: Cross-organizational workflow
management using blockchain technology - towards applicability, auditability, and
automation. In: Proceedings of 51st Hawaii International Conference System Sci-
ence, vol. 4801 (2018). https://doi.org/10.24251/hicss.2018.444
5. Heckel, R.: Open petri nets as semantic model for workflow integration. In:
Ehrig, H., Reisig, W., Rozenberg, G., Weber, H. (eds.) Petri Net Technology for
Communication-Based Systems. LNCS, vol. 2472, pp. 281–294. Springer, Heidel-
berg (2003). https://doi.org/10.1007/978-3-540-40022-6 14
Automated Planning for Supporting Knowledge-Intensive Processes
1 Introduction
In the last decades, the business process management (BPM) community has
established approaches and tools to design, enact, control, and analyze business
processes. Most process management systems follow predefined process models
that capture different ways to coordinate their tasks to achieve their business
goals. However, not all types of processes can be predefined at design time—
some of them can only be specified at run time because of their high degree of
uncertainty [18]. This is the case with Knowledge-intensive Processes (KiPs).
This work is partially supported by CAPES and CNPq scholarships, by the Mobility
Program of the Santander Bank, and by the São Paulo Research Foundation (FAPESP)
with grants #2017/21773-9 and #2019/02144-6. The opinions expressed in this work
do not necessarily reflect those of the funding agencies.
KiPs are business processes with critical decision-making tasks that involve
domain-specific knowledge, information, and data [4]. KiPs can be found in
domains like healthcare, emergency management, project coordination, and case
management, among others. KiP structure depends on the current situation and
new emergent events that are unpredictable and vary in every process instance
[4]. Thus, a KiP’s structure is defined step by step as the process executes, by a
series of decisions made by process participants considering the current specific
situations and contexts [13]. In this sense, it is not possible to entirely define
beforehand which activities will execute or their ordering and, indeed, it is nec-
essary to refine them as soon as new information becomes available or whenever
new goals are set.
These kinds of processes heavily rely on highly qualified and trained profes-
sionals called knowledge workers. Knowledge workers use their own experience
and expertise to make complex decisions to model the process and achieve busi-
ness goals [3]. Despite their expertise, it is often the case that knowledge workers
become overwhelmed with the number of cases, the differences between cases,
rapidly changing contexts, and the need to integrate new information. They
therefore require computer-aided support to help them manage these difficult
and error-prone tasks.
In this paper, we explore how to provide this support by considering the pro-
cess modeling problem as an automated planning problem. Automated planning,
a branch of artificial intelligence, investigates how to search through a space of
possible actions and environment conditions to produce a sequence of actions
to achieve some goal over time [10]. Our work investigates an automated way
to generate process models for KiPs by mapping an artifact-centric case model
into a planning model at run time. To encode the planning domain and planning
problem, we use a case model defined according to the METAKIP metamodel
[20] that encloses data and process logic into domain artifacts. It defines data-
driven activities in the form of tactic templates. Each tactic aims to achieve a
goal and the planning model is derived from it.
In our approach, we use Markov decision processes (MDP) because they allow
us to model dynamic systems under uncertainty [7], although our definition of
the planning problem model enables using different planning algorithms and
techniques. MDP finds optimal solutions to sequential and stochastic decision
problems. As the system model evolves probabilistically, an action is taken based
on the observed condition or state and a reward or cost is gained [7,10]. Thus,
an MDP model allows us to identify decision alternatives for structuring KiPs
at run time. We use PRISM [11], a probabilistic model checker, to implement
the solution for the MDP model.
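To make the underlying computation concrete, the following sketch solves a toy two-state "patient" MDP by value iteration, minimizing the expected cost to reach the goal state; the states, actions, probabilities, and costs are invented for illustration, and in our approach the actual model is solved with PRISM rather than hand-written code.

```go
package main

import "fmt"

// Transition of a toy MDP: with probability Prob, taking the action in
// the current state incurs Cost and moves to state Next.
type Transition struct {
	Prob float64
	Next int
	Cost float64
}

// actions[state][action] is the list of probabilistic outcomes.
var actions = [][][]Transition{
	// state 0: "fever" (hypothetical)
	{
		{{0.7, 1, 1.0}, {0.3, 0, 1.0}}, // action 0: mild treatment, low cost
		{{0.9, 1, 3.0}, {0.1, 0, 3.0}}, // action 1: stronger drug, higher cost
	},
	// state 1: "stable", the absorbing goal state
	{
		{{1.0, 1, 0.0}},
	},
}

// valueIteration computes the minimal expected cost-to-goal per state.
func valueIteration(iters int) []float64 {
	v := make([]float64, len(actions))
	for k := 0; k < iters; k++ {
		for s, acts := range actions {
			best := -1.0
			for _, a := range acts {
				q := 0.0
				for _, tr := range a {
					q += tr.Prob * (tr.Cost + v[tr.Next])
				}
				if best < 0 || q < best {
					best = q
				}
			}
			v[s] = best
		}
	}
	return v
}

func main() {
	fmt.Println(valueIteration(100)) // ≈ [1.43 0]: mild treatment is optimal
}
```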
We present a proof of concept by applying our method in a medical treatment
scenario, which is a typical example of a non-deterministic process. Medical
treatments can be seen as sequential decisions in an uncertain environment.
Medical decisions not only depend on the current state of the patient, but they
are affected by the evolution of the states as well. The evolution of the patient
state is unpredictable, since it depends on factors such as preexisting patient
treatments and patient histories. Support is therefore needed that can take into account patient data, constraints, dependencies, and patient/doctor preferences to help advise the doctor on viable and effective courses of treatment.

2 Motivating Example

This section presents a motivating medical case scenario. Suppose we have the following medical case in the oncology department, stored in the Electronic Medical Record (EMR):

Mary, 58 years old, married, two children. She was diagnosed with a non-Hodgkin lymphoma, was admitted on 20/07/2019, and is receiving R-ICE chemotherapy. R-ICE is named after the initials of the drugs used: rituximab, ifosfamide, carboplatin, etoposide. R-ICE is applied as a course of several sessions (cycles) of treatment over a few months. On 02/10/2019, Mary is supposed to receive the second cycle of R-ICE. However, on admission, she is febrile at 38 °C and presents severe nausea (Level 4).
Table 1. Tactics templates for fever (Fvr) and nausea (Nausea) management
3 Background
This section presents the underlying concepts in our proposal. Section 3.1 pro-
vides an overview of the METAKIP metamodel; Sect. 3.2 introduces basic con-
cepts of automated planning; Sect. 3.3 explains Markov decision process (MDP).
Section 3.4 describes the PRISM tool and language.
Fig. 1. (a) MDP representation [10] and (b) example syntax of a PRISM mdp module and rewards [11]
3.4 PRISM
PRISM [11] is a probabilistic model checker that allows the modeling and
analysis of systems that exhibit probabilistic behavior. The PRISM tool pro-
vides support for modeling and construction of many types of probabilistic
models: discrete-time Markov chains (DTMCs), continuous-time Markov chains
(CTMCs), Markov decision processes (MDPs), and probabilistic timed automata
(PTAs). The tool supports statistical model checking, confidence-level approx-
imation, and acceptance sampling with its discrete-event simulator. For non-
deterministic models it can generate an optimal adversary/strategy to reach a
certain state.
Models are described using the PRISM language, a simple, state-based lan-
guage based on the reactive modules formalism [1]. Figure 1(b) presents an exam-
ple of the syntax of a PRISM module and rewards. The fundamental compo-
nents of the PRISM language are modules. A module has two parts: variables
and commands. Variables describe the possible states that the module can be
in at a given time. Commands describe the behavior of a module, how the state
changes over time. A command comprises a guard and one or more updates. The
guard is a predicate over all the variables in the model. Each update describes
a transition that the module can take if the guard is true. A transition is speci-
fied by giving the new values of the variables in the module. Each update has a
probability which will be assigned to the corresponding transition. Commands
can be labeled with actions. These actions are used for synchronization between
modules. Costs and rewards are expressed as real values associated with certain states or transitions of the model.
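The guarded-command semantics described above can be mimicked in a few lines: a command fires when its guard holds, and one of its probabilistic updates is sampled. This is only a sketch of the PRISM module semantics, not of the PRISM tool itself; the variable names and probabilities are made up.

```go
package main

import (
	"fmt"
	"math/rand"
)

// StateVars holds the module's variables and their current values.
type StateVars map[string]int

// Update is one probabilistic branch of a command.
type Update struct {
	Prob  float64
	Apply func(StateVars)
}

// Command pairs a guard over the state with its probabilistic updates.
type Command struct {
	Guard   func(StateVars) bool
	Updates []Update
}

// step fires the first enabled command (nondeterminism resolved naively)
// and samples one of its updates according to the probabilities.
func step(s StateVars, cmds []Command) bool {
	for _, c := range cmds {
		if !c.Guard(s) {
			continue
		}
		r := rand.Float64()
		acc := 0.0
		u := c.Updates[len(c.Updates)-1] // fallback for rounding slack
		for _, cand := range c.Updates {
			acc += cand.Prob
			if r < acc {
				u = cand
				break
			}
		}
		u.Apply(s)
		return true
	}
	return false // deadlock: no guard is true
}

func main() {
	s := StateVars{"fever": 1}
	cmds := []Command{{
		Guard: func(s StateVars) bool { return s["fever"] == 1 },
		Updates: []Update{
			{0.7, func(s StateVars) { s["fever"] = 0 }}, // treatment works
			{0.3, func(s StateVars) {}},                 // fever persists
		},
	}}
	step(s, cmds)
	fmt.Println(s)
}
```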
The state of a case is the set of values (available data) of the attributes contained in the artifacts of the context and the environment. However, since the number of attributes of the artifacts is very large, it is necessary to limit the attention to only the most relevant ones, which determine the current state of the case at a given time t.
At this point, we are able to define the planning problem to generate a plan
as a process model.
GSt serves as an input for searching an execution path for a specific situation.
Different goal states can be defined over time.
Our problem definition enables the use of different planning algorithms and the application of automated planning tools to generate alternative plans. As we are interested in KiPs, which are highly unpredictable processes, we use Markov Decision Processes to formulate the model for the planner. MDPs allow us to represent uncertainty with a probability distribution. An MDP supports sequential decision making and reasons about future sequences of actions and observations, which provides us with high levels of flexibility in the process models. In the following, we show how to derive an MDP model expressed in the PRISM language from a METAKIP model automatically.
To find the set of relevant planning operators ROt, we first select tactics whose preconditions are satisfied by the current situation OSt and whose goal is related to the target state GSt. This can be done by calculating the percentages of both the satisfied preconditions and the achievable goals. If these percentages are within an acceptable range according to the rules of the domain, the tactics are selected. Second, this first set of tactics is shown to the knowledge workers, who select the most relevant tactics. The set of selected relevant tactics is denoted by RT. From this set of tactics, we verify which activities inside the tactics are available at time t. Thus, the set of available actions at time t is denoted by At = {a1, a2, ..., an}. Finally, the relevant planning operators ROt are created by means of At.
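A sketch of the first, automatic filtering stage might look as follows; the thresholds and the representation of OSt and GSt as sets of facts are assumptions made for illustration.

```go
package main

import "fmt"

// Tactic with its precondition facts and goal facts.
type Tactic struct {
	Name          string
	Preconditions []string
	Goals         []string
}

// fraction returns the share of needed facts present in have.
func fraction(needed []string, have map[string]bool) float64 {
	if len(needed) == 0 {
		return 1.0
	}
	n := 0
	for _, p := range needed {
		if have[p] {
			n++
		}
	}
	return float64(n) / float64(len(needed))
}

// CandidateTactics returns the tactics whose precondition-satisfaction
// and goal-relevance percentages pass the domain thresholds; knowledge
// workers then pick the relevant ones by hand.
func CandidateTactics(ts []Tactic, ost, gst map[string]bool, minPre, minGoal float64) []Tactic {
	var out []Tactic
	for _, t := range ts {
		if fraction(t.Preconditions, ost) >= minPre && fraction(t.Goals, gst) >= minGoal {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	ts := []Tactic{{"FeverMgmt", []string{"fever>=38"}, []string{"fever<38"}}}
	ost := map[string]bool{"fever>=38": true}
	gst := map[string]bool{"fever<38": true}
	fmt.Println(CandidateTactics(ts, ost, gst, 0.8, 0.5))
}
```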
5 Proof of Concept
This section formulates a patient-specific MDP model in PRISM for the medical scenario presented in Sect. 2. In the area of health care, medical decisions can be modeled with Markov Decision Processes (MDPs) [5,17]. Although MDPs are most suitable for certain types of problems involving complex decisions, such as liver transplants, HIV, diabetes, and others, almost every medical decision can be modeled as an MDP [5]. We generate the PRISM model by defining the observable situation OSt, the goal state GSt, and the set of relevant planning operators ROt.
Fig. 2. Plan for reaching the goal state optimizing the cost
according to the state actually reached after activity execution. Further studies are necessary to help guide knowledge workers in interpreting and following the model.
actions, new emerging goals, and new information, which provides high levels of flexibility and adaptation. As we describe a generic planning model, it is possible to use different planning algorithms or to combine other planning models, such as the classical planning model or hierarchical task networks (HTN), according to the structuring level of the processes at different moments. Thereby, we could apply this methodology to other types of processes, from well-structured processes to loosely structured or unstructured ones.
Our approach relies on MDPs, which require defining transition probabilities, which in some situations can be very difficult and expensive to obtain. Nowadays, a huge amount of data is produced by sensors, machines, software systems, etc., which might facilitate the acquisition of data to estimate these transition probabilities. In the medical domain, the increasing use of electronic medical record systems should provide medical data from thousands of patients, which can be exploited to derive these probabilities. A further limitation of MDPs is the size of the problem: the state space explodes, and the model becomes more difficult to solve. In this context, several techniques for finding approximate solutions to MDPs can be applied, in addition to taking advantage of the rapid increase in processing power in recent years.
Flexible processes could easily be designed if we replan after each activity execution. In fact, our approach suggests a system with a constant interleaving of planning, execution, and monitoring. In this way, it helps knowledge workers during the decision-making process.
7 Conclusion
References
1. Alur, R., Henzinger, T.A.: Reactive modules. Form. Methods Syst. Des. 15(1),
7–48 (1999)
2. Butcher, H.K., Bulechek, G.M., Dochterman, J.M.M., Wagner, C.: Nursing Inter-
ventions classification (NIC)-E-Book. Elsevier Health Sciences (2018)
3. Davenport, T.: Thinking for a Living. How to Get Better Performance and Results.
Harvard Business School Press, Boston (2005)
4. Di Ciccio, C., Marrella, A., Russo, A.: Knowledge-intensive processes: characteris-
tics, requirements and analysis of contemporary approaches. J. Data Semant. 4(1),
29–57 (2015)
5. Díez, F., Palacios, M., Arias, M.: MDPs in medicine: opportunities and challenges. In: Decision Making in Partially Observable, Uncertain Worlds: Exploring Insights from Multiple Communities (IJCAI Workshop), vol. 9, p. 14 (2011)
6. Ferreira, H.M., Ferreira, D.R.: An integrated life cycle for workflow management
based on learning and planning. Int. J. Cooper. Inf. Syst. 15(04), 485–505 (2006)
7. Ghallab, M., Nau, D., Traverso, P.: Automated Planning: Theory and Practice.
Elsevier (2004)
8. Henneberger, M., Heinrich, B., Lautenbacher, F., Bauer, B.: Semantic-based plan-
ning of process models. In: Multikonferenz Wirtschaftsinformatik (MKWI). GITO-
Verlag (2008)
9. Hull, R., Motahari Nezhad, H.R.: Rethinking BPM in a cognitive world: trans-
forming how we learn and perform business processes. In: La Rosa, M., Loos, P.,
Pastor, O. (eds.) BPM 2016. LNCS, vol. 9850, pp. 3–19. Springer, Cham (2016).
https://doi.org/10.1007/978-3-319-45348-4 1
10. Kochenderfer, M.J.: Decision Making Under Uncertainty: Theory and Application.
MIT press, Cambridge (2015)
11. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic
real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 585–591. Springer, Heidelberg (2011). https://doi.org/10.1007/978-
3-642-22110-1 47
12. Laurent, Y., Bendraou, R., Baarir, S., Gervais, M.P.: Planning for declarative pro-
cesses. In: Proceedings of the 29th Annual ACM Symposium on Applied Comput-
ing, pp. 1126–1133. ACM (2014)
13. Marjanovic, O.: Towards is supported coordination in emergent business processes.
Bus. Process Manag. J. 11(5), 476–487 (2005)
14. Marrella, A.: Automated planning for business process management. J. Data
Seman. 8(2), 79–98 (2019)
15. Marrella, A., Lespérance, Y.: A planning approach to the automated synthesis of
template-based process models. SOCA 11(4), 367–392 (2017)
16. Marrella, A., Mecella, M., Sardina, S.: SmartPM: an adaptive process management
system through situation calculus, IndiGolog, and classical planning. In: Proceed-
ings of the Fourteenth International Conference on Principles of Knowledge Rep-
resentation and Reasoning (KR 2014), pp. 518–527 (2014)
17. Mattila, R., Siika, A., Roy, J., Wahlberg, B.: A Markov decision process model
to guide treatment of abdominal aortic aneurysms. In: 2016 IEEE Conference on
Control Applications (CCA), pp. 436–441. IEEE (2016)
18. Reichert, M., Weber, B.: Enabling Flexibility in Process-Aware Information Systems: Challenges, Methods, Technologies. Springer, Heidelberg (2012)
19. Venero, S.K.: DW-SAAArch: a reference architecture for dynamic self-adaptation
in workflows. Master’s Thesis, UNICAMP, Campinas, Brazil (2015)
20. Venero, S.K., Dos Reis, J.C., Montecchi, L., Rubira, C.M.F.: Towards a metamodel
for supporting decisions in knowledge-intensive processes. In: Proceedings of the
34th ACM/SIGAPP Symposium on Applied Computing, pp. 75–84. ACM (2019)
Scheduling Processes Without Sudden
Termination
1 Introduction
Modeling and verification of the temporal aspects of a business process are crucial for process management. Modeling temporal aspects includes defining deadlines, durations, and other temporal constraints [5]. Verification of the temporal qualities aims at determining whether a given process model meets certain quality criteria, in particular, whether time failures can be avoided by defining adequate schedules for the dispatching of activities.
In recent years, there has been increasing awareness of the distinction between activities whose duration is under the control of an agent and activities whose duration cannot be controlled, but merely observed at run time [22]. These uncontrollable durations are called contingent. A good example of activities with contingent durations are bank money transfers within the EU, which are guaranteed to take between 1 and 4 working days, although the client cannot control the actual duration. In a similar way, a service contract might covenant a visit by a technician within 24 hours, but the client cannot control when the technician will actually appear.
The existence of contingent activities in processes led to the formulation of dynamic controllability [7,19,22] as the preferred criterion for the temporal correctness of processes. Dynamic controllability requires the existence of a dynamic schedule (execution strategy), which assigns timestamps for starting and finishing activities in a reactive manner, in response to the observation of the actual durations of contingent activities at run time. Dynamic controllability is the most relaxed notion guaranteeing that a process controller is able to steer the execution while satisfying all temporal constraints. Consequently, several techniques have been developed to check the dynamic controllability of a process [3,19].
Nevertheless, dynamic controllability might admit processes where each admissible dynamic schedule requires some activities to start without yet knowing when they need to complete. This leads to the subsequent sudden-termination scheduling of their end event. In Sect. 2 we give a precise demonstration and an example of this phenomenon. While for some activities this poses no problem (e.g., for waiting tasks), for other non-contingent activities a sudden termination is highly undesirable, unacceptable, or even impossible, in particular for activities involving human actors or invoking uninterruptible external services [10,12]. We call such activities semi-contingent: their duration between minimum and maximum duration can be chosen by the process controller, but only until the activity starts. In Sect. 2 we give some examples of semi-contingent activities and of the sudden-termination problem.
The research question we address here is: how can we determine whether a given process can be scheduled without the risk of a sudden termination of a task? We show that sudden termination can be identified by the presence of specific constraint patterns, and we propose a technique to check whether it is possible to (dynamically) schedule a process with the guarantee that no sudden termination will be forced at run time.
The contributions of this paper are the following:
– The discovery and characterization of the problem of sudden termination,
which might arise in dynamically controllable processes
– The introduction of the notion of semi-contingent activities to model relevant
temporal characteristics of activities
– The identification and characterization of patterns of temporal constraints
and conditions, which are sufficient and necessary for the problem of sudden
termination
– The definition of semi-dynamic controllability, which specializes the tradi-
tional notion of dynamic controllability to address semi-contingent activities
and sudden termination
– A procedure to check semi-dynamic controllability
These results contribute to the development of a comprehensive framework to support the design, modeling, and analysis of business processes at design time and to monitor the time-aware execution of business processes at run time. The remainder of this paper is structured as follows: in Sect. 2 we illustrate the problem with the help of examples. In Sect. 3 we introduce a lean process model which allows the formulation of the problem, show how a specific pattern can induce the problem, and show how to solve it. In Sect. 4 we provide an implementation of a checking procedure. In Sect. 5 we discuss related work, and in Sect. 6 we draw conclusions.
(Figure fragment: activity Arrange Shipment with duration [5, 7], semi-contingent, and activity Receive Payment with duration [3, 5], contingent.)
For the most general applicability, we introduce here a minimal process model which is sufficient to capture the patterns for which sudden termination may occur. We consider the most common control flow patterns: sequence, inclusive and disjunctive splits, and the corresponding joins. To avoid design flaws, and in accordance with the current state of the art in this field, we assume that processes are acyclic and block-structured [6].
We consider activity durations, process deadlines, and upper- and lower-bound constraints between events (start and end of activities). We measure time in chronons (representing, e.g., hours or days), whose domain is the set of natural numbers on an increasing time axis starting at zero. A duration is defined as the distance between two time points on the time axis.
Finally, we distinguish between non-contingent, semi-contingent, and contingent activities. The duration of contingent activities, by their nature, cannot be controlled, so it cannot be known when they will actually terminate. The process controller may, however, control the duration of non-contingent activities at any time. This means, in particular, that they are allowed to start with no knowledge of when they will have to end, thus accepting the possibility of sudden termination. Semi-contingent activities, in contrast, require knowing, at their start time, when they must terminate: the process controller can set their duration only until the activity starts.
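To make the distinction concrete, here is a minimal Python sketch of the three activity classes with their duration bounds in chronons; the type and field names are our own illustration, not part of the paper's formal model.

```python
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    NON_CONTINGENT = "non-contingent"    # duration controllable at any time, even after start
    SEMI_CONTINGENT = "semi-contingent"  # duration fixed at the latest when the activity starts
    CONTINGENT = "contingent"            # duration can only be observed, never controlled

@dataclass
class Activity:
    name: str
    d_min: int  # minimum duration in chronons
    d_max: int  # maximum duration in chronons
    kind: Kind

# e.g. a technician visit whose duration the controller fixes before it starts
visit = Activity("Technician visit", 1, 3, Kind.SEMI_CONTINGENT)
```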
A schedule for a process states when each activity should be started and
terminated. If a schedule exists, we call the process controllable. Controllability is
often considered too strict, as it would not admit situations where, e.g. the time-
point for the start of an activity depends on the observed duration of preceding
contingent activities.
Dynamic controllability requires the existence of a dynamic schedule (or
dynamic execution strategy), where the decision about starting and ending activ-
ities can be made based on the timestamp of all earlier events.
There are several techniques for checking the dynamic controllability of pro-
cesses. We use here the technique of mapping a process model to a Simple
Temporal Network with Uncertainty (STNU) and apply constraint propagation
techniques which are proven to be sound and complete for checking dynamic
controllability [3]. We present this technique in Sect. 4.
In Sect. 2 we gave an example of a process which is dynamically controllable,
but suffers from the problem of sudden termination of a semi-contingent activity.
In the next section we explore a pattern of constraints, which may lead to the
sudden termination of an activity. We use this pattern to formulate a new notion
of controllability, which is somewhat stricter than dynamic controllability, and
introduce a technique to verify a process model for such a notion.
Proof. We show that C.dmax + v ≤ S.e − C.s ≤ C.dmin + w is a necessary and sufficient condition for the activities S and C not being in a STP.
Necessary condition: we show that if the condition does not hold, a sudden termination might occur. If the condition does not hold, then there are no S.s, S.d, C.s with C.dmax + v ≤ S.e − C.s ≤ C.dmin + w; this is only possible if C.dmax + v > C.dmin + w. Then C and S are in a STP: for all C.s, S.s there is no S.d such that for all C.d the constraints S.s + S.d ≤ C.s + C.d + w and S.s + S.d ≥ C.s + C.d + v hold. Indeed, such an S.d would have to satisfy C.d + v ≤ S.s + S.d − C.s ≤ C.d + w for all C.d, which requires in particular that C.dmax + v ≤ S.e − C.s ≤ C.dmin + w, contradicting the assumption.
Sufficient condition: we show that if the inequality holds, sudden termination does not occur, i.e., there exist C.s, S.s, S.d such that for all C.dmin ≤ C.d ≤ C.dmax the constraints C.s + C.d + v ≤ S.s + S.d ≤ C.s + C.d + w are satisfied. This holds since for all C.dmin ≤ C.d ≤ C.dmax: C.d + v ≤ C.dmax + v ≤ S.s + S.d − C.s ≤ C.dmin + w ≤ C.d + w.
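The condition can also be checked mechanically. The following Python fragment (our own illustration with made-up numbers, not from the paper) tests whether the admissible window [C.dmax + v, C.dmin + w] for S.e − C.s is non-empty:

```python
def stp_free(c_dmin: int, c_dmax: int, v: int, w: int) -> bool:
    # The window for S.e - C.s is [c_dmax + v, c_dmin + w];
    # it is non-empty exactly when c_dmax + v <= c_dmin + w.
    return c_dmax + v <= c_dmin + w

# C.d in [3, 5], v = 2, w = 5: window [7, 8] is non-empty, no STP
assert stp_free(3, 5, 2, 5)
# C.d in [3, 5], v = 2, w = 3: window [7, 6] is empty, an STP occurs
assert not stp_free(3, 5, 2, 3)
```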
This theorem can now be used to establish conditions that a process model has to fulfill such that it is dynamically controllable and a STP cannot occur. In particular, we can show that a STP cannot occur when the process model includes a particular lower-bound or, respectively, upper-bound constraint between the start of S and the start of C.
We are now ready to apply this result to checking whether a sudden termination problem can be avoided in the execution of a process. For each ST-constellation
In previous works [9] we showed how to check whether a process model, such as the process in Fig. 1, is dynamically controllable by mapping it into an equivalent STNU (Simple Temporal Network with Uncertainty) [19].
In a nutshell, a STNU is a directed graph, in which nodes represent time
points and edges represent constraints between time points. A special time point
zero marks the reference in time, after which all other time points occur. Edges
can be non-contingent or contingent. Non-contingent edges represent constraints
which can be enforced by the execution environment by assigning appropriate
values to the time points. Contingent edges (also called links) represent con-
straints which are guaranteed to hold, but the corresponding time point assign-
ments cannot be controlled, only observed.
We use the notation (A, B, δ) for non-contingent edges from A to B with
weight δ, which require that B ≤ A + δ; and (AC , l, u, C) for contingent links
between AC and C, which state that C occurs some time between l and u after
AC . For a detailed formalization on STNUs, we refer to [19].
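A minimal Python sketch of this representation is shown below; the adjacency structure and the example values are our own illustration, not the formalization of [19].

```python
class STNU:
    def __init__(self):
        self.edges = []       # (A, B, delta): requires B <= A + delta
        self.cont_links = []  # (A_C, l, u, C): C occurs between l and u after A_C

    def add_edge(self, a: str, b: str, delta: int):
        self.edges.append((a, b, delta))

    def add_contingent_link(self, a_c: str, l: int, u: int, c: str):
        self.cont_links.append((a_c, l, u, c))

net = STNU()
net.add_edge("A.s", "Z", 0)                  # Z <= A.s: A cannot start before time zero
net.add_edge("Z", "end", 21)                 # end <= Z + 21: an (illustrative) overall deadline
net.add_contingent_link("D.s", 2, 7, "D.e")  # D.e is observed 2 to 7 chronons after D.s
```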
Figure 2 shows the STNU derived by mapping the process model of Fig. 1.
Note that in Fig. 2 we adopted the usual STNU notation with contingent edges
dashed, inverted w.r.t. non-contingent edges, and labeled with the contingent
time point name. For a more compact presentation, in the figure we did not
include nodes resulting from the mapping of the par-split and par-join.
(Fig. 2. STNU obtained by mapping the process model of Fig. 1; graph not reproduced. Nodes: Z, start, P.s, P.e, A.s, A.e, R.s, R.e, D.s, D.e, end; contingent links de and re.)
Algorithm 1 shows the procedure we propose for checking semi-dynamic controllability. A process model P is mapped into an STNU T. Additionally, we keep a data structure ST containing the STNU nodes representing semi-contingent activities, which is needed for identifying sudden termination patterns.
First, T is checked for dynamic controllability by applying a constraint propagation procedure check_dc(T), which as a side effect computes the closure of the set of constraints. If the procedure returns True, the process is dc. Then find_stp(T, ST) searches for and returns all STPs. Then there is a loop (repeated as long as the network is dc and there are unresolved STPs) with three steps: (i) for each STP p found, edges corresponding to the constraints that resolve p to avoid a sudden termination problem are added to T; (ii) T is checked for dynamic controllability, deriving additional implicit constraints; (iii) the network is searched for unresolved STPs. If at the end of the iteration T remains dc, then it is also sdc and True is returned. One can verify (see the negative cycle introduced in Fig. 3 between R.s and A.e) that the process of the running example is not sdc.
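As a compact illustration, the following Python sketch mirrors the structure of Algorithm 1; check_dc, find_stp and add_resolving_edges are assumed to be provided (the constraint propagation check of [3], the STP search, and the addition of the resolving constraints) and are passed in as functions rather than implemented here.

```python
def check_sdc(T, ST, check_dc, find_stp, add_resolving_edges) -> bool:
    # Dynamic controllability check; also computes the constraint closure.
    if not check_dc(T):
        return False                  # not even dynamically controllable
    stps = find_stp(T, ST)            # all sudden termination patterns
    while stps:                       # loop while dc and unresolved STPs exist
        for p in stps:
            add_resolving_edges(T, p)  # (i) add constraints resolving p
        if not check_dc(T):            # (ii) re-check dc, derive implicit constraints
            return False               # resolving the STPs broke controllability
        stps = find_stp(T, ST)         # (iii) search for unresolved STPs
    return True                        # T is dc and STP-free, hence sdc
```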
The correctness of the procedure trivially follows from the correctness of the
existing constraint propagation procedures, and from the Theorems of Sect. 3.
Fig. 3. STNU resulting from the application of Algorithm 1 to the process of Fig. 1.
We report the average measured execution times for the various process sizes
in Fig. 4. On average, executing Algorithm 1 for a process of size 10 required
0.13s; for size 20, 0.83s; for size 30, 6.29s; for size 40, 18.94s; for size 50, 41.00s.
Our experiments showed that, despite the addition of new constraints to resolve the STPs and the repeated execution of the dynamic controllability checking procedure, the required computation times are still acceptable for a design-time check. Thus, we regard the proposed approach as applicable for most practical applications which require design-time checking of semi-dynamic controllability.
(Fig. 4. Average execution time [s] of Algorithm 1 over the process size [number of nodes], for sizes 10 to 50.)
6 Related Work
Efforts have been devoted in the last decades both to developing different notions of controllability for temporal constraint networks and to developing more expressive network models [15,19,22,25]. With this work, we contribute to this field of research by introducing a specialization of dynamic controllability that avoids a class of unwanted behaviors.
7 Conclusions
References
1. Bettini, C., Wang, X., Jajodia, S.: Temporal reasoning in workflow systems. Dis-
trib. Parallel Databases 11(3), 269–306 (2002)
2. Breu, R., et al.: Towards living inter-organizational processes. In: 15th IEEE Con-
ference on Business Informatics, pp. 363–366. IEEE (2013)
3. Cairo, M., Rizzi, R.: Dynamic controllability made simple. In: 24th International Symposium on Temporal Representation and Reasoning (TIME 2017). LIPIcs, vol. 90, pp. 8:1–8:16 (2017)
4. Cardoso, J., Sheth, A., Miller, J., Arnold, J., Kochut, K.: Quality of service for
workflows and web service processes. J. Web Semant. 1(3), 281–308 (2004)
5. Cheikhrouhou, S., Kallel, S., Guermouche, N., Jmaiel, M.: The temporal perspec-
tive in business process modeling: a survey and research challenges. SOCA 9(1),
75–85 (2014). https://doi.org/10.1007/s11761-014-0170-x
6. Combi, C., Gambini, M.: Flaws in the flow: the weakness of unstructured busi-
ness process modeling languages dealing with data. In: Meersman, R., Dillon, T.,
Herrero, P. (eds.) OTM 2009. LNCS, vol. 5870, pp. 42–59. Springer, Heidelberg
(2009). https://doi.org/10.1007/978-3-642-05148-7 6
24. Zavatteri, M., Combi, C., Viganò, L.: Resource controllability of workflows under
conditional uncertainty. In: Di Francescomarino, C., Dijkman, R., Zdun, U. (eds.)
BPM 2019. LNBIP, vol. 362, pp. 68–80. Springer, Cham (2019). https://doi.org/
10.1007/978-3-030-37453-2 7
25. Zavatteri, M., Viganò, L.: Conditional simple temporal networks with uncertainty
and decisions. Theoret. Comput. Sci. 797, 77–101 (2019)
Process Mining (BPMDS 2020)
Cherry-Picking from Spaghetti:
Multi-range Filtering of Event Logs
Abstract. Mining real-life event logs results in process models which provide little value to the process analyst without support for handling complexity. Filtering techniques are particularly helpful to tackle this problem. Such techniques have focused on leaving out infrequent aspects of the process, which are considered outliers. However, it is exactly in these outliers that important insights on the process can be gathered. This paper addresses this problem by defining multi-range filtering. Our technique not only allows combining both frequent and infrequent aspects of the process, but also supports arbitrary user-defined frequency intervals for activities and variants. We evaluate our approach through a prototype based on the PM4Py library and show its benefits in comparison to existing filtering techniques.
1 Introduction
The goal of process mining is extracting actionable process knowledge using
event logs of IT systems that are available in the organizations [1]. Process
discovery is one of the areas of interest of process mining that is concerned with
the extracting the process models from logs. With the development of process
mining, a number of automated process discovery algorithms that address this
problem has appeared.
The problem with automated discovery of process models from event logs is that, despite the variety of different algorithms, automated process discovery methods all suffer from joint deficiencies when used on real-life event logs [1]: they produce large spaghetti-like models, and they produce models with either a low level of fitness to the event log, low precision, or low generalization. Correcting these shortcomings has proved to be a difficult task. Research by Augusto et al. [2] states that for complex event logs it is highly recommended to filter the logs before applying automated process discovery techniques, and that without this type of filtering the precision of the resulting models is close to zero. The authors also highlight a research gap that needs to be closed, suggesting the need to develop a filter that can be tuned at will to deal with complex logs.
Therefore, the purpose of our study is to close this research gap by implementing a new filter, able to capture both the most frequent and the rare behavior. We created a prototype based on PM4Py, a process mining toolkit for Python [3]. Our prototype is fully customizable: the user defines an arbitrary number of ranges for both the activities and the variants of the process to analyze. In this research, we demonstrate how our technique helps to unveil new insights into the process using an illustrative example from a real-world event log.
This paper is structured as follows. Section 2 describes the problem set-
ting and discusses common process mining techniques that rely on filtering
of the logs in order to simplify models. Further, we present different types of
filters and compare them. Finally, we derive requirements for a new filter type.
Section 3 presents a conceptual description of our filter with the formal defi-
nitions, while Sect. 4 presents an example that emphasizes the benefits of this
technique. Section 5 shows the benefits of our technique against existing process
mining tools. Section 6 concludes the paper and discusses future work.
2 Theoretical Background
This section describes the problem and provides an overview on related literature
before deriving three requirements for a filtering technique.
The focus of existing filtering techniques, from both academia and practice, has been on filtering out infrequent behaviour. We argue that in some cases it is the infrequent behaviour that gives us important insights on problems in the process, thus supporting improvement. Indeed, existing tools such as ProM1, Disco2 and Celonis3 are able to filter for specific behaviour. However, there is no way to set these filters in such a way that multiple variants or activities are shown together. This way of filtering leaves out important information, which might be revealed, for instance, by a combination of the most and the least frequent cases.
Let us illustrate the problem through a running example. Figure 1 shows a simple complaint handling process adapted from [5]. The process works as follows. After a client files a complaint, (s)he immediately receives an automated confirmation message. Next, an employee brings the complaint to a meeting with colleagues in order to discuss a solution. The same employee is in charge of contacting the customer back with an apology and proposes a solution. The solution may be accepted or rejected by the client. In case of acceptance, the solution is executed right away. In case of rejection, the employee contacts the client to investigate alternatives. If a reasonable alternative is found, the employee has a new meeting with colleagues to discuss the solution and proceeds as usual. If no alternative solution can be found, the complaint is brought to court and the process fails.
(Fig. 1. Complaint handling process model: activities Send automatic reply to customer, Discuss solution, Send apology, Propose solution, Execute solution, Evaluate alternative solutions, Go to court; events Complaint received, Positive/Negative response received, Alternative solution exists, No alternative, Complaint (not) addressed.)
There are several ways in which instances of the process may traverse the depicted process model. The sunny case scenario is the one in which an agreement with the client is found right away. In a good process, this case should occur frequently. At the opposite end, the rainy case scenario consists of the cases which result in no agreement, so that the company is brought to court. In this case, the costs sustained by the company may be much higher than settling for a solution. An intermediate scenario is the one in which a customer does not accept the first proposed solution, but some iterations are done. In order to improve the process, the company is interested in comparing the sunny and rainy case scenarios in order to understand which decisions and proposed solutions lead to the respective outcomes. Table 1a lists the activities involved in the process as well as their short labels for better readability.
1 www.promtools.org. 2 fluxicon.com/disco. 3 www.celonis.com.
The same consideration also holds for events and activities. Indeed, the com-
pany might be interested in activities or events which occur within a specific
range of frequencies. For instance, the top 10 most frequent and the top 10 most
infrequent activities can play a role in guiding process redesign. In other words, the frequency of traces and activities does not necessarily reflect importance. There
may be extremely infrequent variants or activities which have a very high impact
on the process (e.g., Black Swans [10]). Hence, it is crucial that filtering does
not compromise this information.
edges that the first filter did not remove by using eventually-follows graphs.
However, process models mined using the Inductive miner are often oversimplified. A different approach to the previous two is the Fuzzy Miner. This algorithm filters noise directly on the discovered model, using significance and correlation thresholds defined by the user.
As we can see, there are numerous techniques and algorithms which can be used to simplify event logs and models to help users understand the core process better. However, all of them achieve this by filtering out infrequent behaviour, considering it noise in the event logs [1]. We argue that this is a substantial limitation that needs to be addressed, since infrequent behaviour can carry important information which is lost by filtering it out of the log. For example, having insight into rare cases can help companies detect errors in the process or even detect fraud. Furthermore, none of the presented techniques considers that users might want to observe a process model that comprises both the most frequent and infrequent traces of the process.
RQ1. (Select variants). A filtering technique must allow the user to slice
the log. That is, it must offer a way of selecting process variants relevant to
the user.
RQ2. (Select activities). A filtering technique must be able to dice the log.
That is, it must offer a way of selecting the most relevant activities for the
user.
RQ3. (Multi-range filtering). A filtering technique must be able to slice
and dice on multiple ranges. That is, it must offer a way of selecting relevant
information from several frequency intervals.
Our technique is summarized in Fig. 2. It takes as input an event log and two user-defined multi-ranges. A multi-range is a set of intervals of frequencies. As we use frequencies, interval boundaries range from 0 to 1, where [0,0] means that we get the least frequent variant or activity, and [0,1] means that we consider all possible behavior. The aforementioned multi-ranges are used respectively by two filter types: i) the variants filter; and ii) the activities filter. These two filters can be used independently or consecutively. In the latter case, their application must follow the order: variants filter first. The output of each filter is a simplified event log complying with the filtering criteria. This event log can be used by any process mining technique to generate a process model which allows the user to analyze the data.
3.2 Preliminaries
3.3 Implementation
Our implementation provides two filters: the variants filter and the activities filter. These two filters are composable, but their application is not commutative, i.e., it has to be performed in a strictly defined order: first the variants filter is applied, and then the activities filter is applied to its result. In case the former filtered out some variants, only the activities present in the remaining variants can be used in the latter.
We are interested in filtering at multiple ranges in the event log. These ranges represent frequencies expressed by the user in the form of sets of intervals. That is, R = {[min_0, max_0], [min_1, max_1], ..., [min_n, max_n]} with min_i ≤ max_i, i = 0, ..., n, signifies that the user wants to retain from the log the amount of information that falls into either of the intervals [min_0, max_0], ..., [min_n, max_n]. Ranges can be applied both to filtering on the variants level (referred to as Rv) and to filtering on the activities level (Ra). Since the range boundaries are specified as frequency percentages, the minimum value of min_i is 0 and the maximum value of max_i is 1. We also establish that [min, max] means that the boundaries of the interval are included and (min, max) means the boundaries are excluded. With this definition we can express the non-overlap condition on the ranges specified by the user as ∀i, j ∈ [0...n], i ≠ j ⇒ (min_i, max_i) ∩ (min_j, max_j) = ∅. This is a precondition for applying both the activities and the variants filter. In other words, ranges may share boundaries but they must not overlap.
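A possible realization of this precondition check in Python is sketched below (our own code, not necessarily what the prototype does): ranges are sorted by their lower bound and rejected as soon as one starts before the previous one ends.

```python
def ranges_valid(ranges):
    s = sorted(ranges)  # sort intervals by lower bound
    if any(lo > hi for lo, hi in s):
        return False    # malformed interval
    # adjacent intervals may share a boundary but must not overlap
    return all(s[i][1] <= s[i + 1][0] for i in range(len(s) - 1))

assert ranges_valid([(0.0, 0.1), (0.1, 0.3), (0.4, 0.6)])  # shared boundary: accepted
assert not ranges_valid([(0.0, 0.2), (0.1, 0.3)])          # overlap: rejected
```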
Our implementation consists of three main blocks. First, the ranges specified
by the user for each of the applied filters are checked for overlaps. If the ranges
are incorrect, an error is produced and the filtering is not applied.
Second, if the ranges are correct, the variants filter can be applied. The
variants are filtered according to Algorithm 1.
Third, we can apply Algorithm 2 on the resulting log. First, it builds a list of activities sorted by their frequency, analogously to Algorithm 1. Then, a range filter is applied in the same manner. Finally, we iterate over all traces in the input log and rebuild them in such a way that only the filtered activities remain in the trace. The new trace is appended to the output log only if it is not empty, i.e., it contains at least one of the activities that should remain. A compact sketch of both filters is given below.
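The sketch condenses both algorithms into plain Python (an illustration under our own simplifying assumptions, not the prototype's code): a log is a list of traces, a trace is a tuple of activity labels, and the percentile position of a variant or activity is its rank in the frequency-sorted list, normalized to [0, 1].

```python
from collections import Counter

def in_ranges(pos, ranges):
    return any(lo <= pos <= hi for lo, hi in ranges)

def variants_filter(log, ranges):
    freq = Counter(log)                              # variant -> frequency
    order = sorted(freq, key=freq.get)               # least frequent first
    n = max(len(order) - 1, 1)
    keep = {v for i, v in enumerate(order) if in_ranges(i / n, ranges)}
    return [t for t in log if t in keep]

def activities_filter(log, ranges):
    freq = Counter(a for t in log for a in t)        # activity -> frequency
    order = sorted(freq, key=freq.get)
    n = max(len(order) - 1, 1)
    keep = {a for i, a in enumerate(order) if in_ranges(i / n, ranges)}
    rebuilt = (tuple(a for a in t if a in keep) for t in log)
    return [t for t in rebuilt if t]                 # drop traces that became empty
```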
4 Results
Next, we built a prototype to evaluate our technique. This section presents the
results. First, we describe the experimental setup. Then we demonstrate that
our technique addresses all the requirements by applying our technique to the
running example we provided in Sect. 2.1. Last, we show the usefulness of our
technique in a real-life log.
We generated a log of our example process in Fig. 1 using BIMP5 . The log con-
tains 1000 cases and was built with the following rules: i) positive response is
received with 80% probability; ii) negative response is received with 20% prob-
ability; iii) alternative solution exists with 80% probability; iv) no alternative
solutions exist with 20% probability.
In order to evaluate our technique, let us apply our prototype on this artificial log. As already mentioned, the two filters can be used both separately and combined. First, we can use the variants filter to keep the process behaviour that is of interest to us. Say we are interested in the most frequent and the least frequent variants. To do that, we apply Algorithm 1 and specify two ranges for the filter: Rv = {[0, 0.15], [0.9, 1]}. This means we want to keep the 15% least frequent paths as well as the 10% most frequent ones. It is very important to interpret these ranges correctly: by taking the 15% most infrequent paths we do not mean taking 15% of the cases. Instead, we mean the paths that lie between the 0th and the 15th percentile in a list of all variants in the input log sorted by their frequency.
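Using the sketch from Sect. 3.3, this selection would read as follows (the log contents and activity labels a, b, c, d are placeholders, not the actual simulated log):

```python
log = [("a", "b", "c")] * 800 + [("a", "b", "d")] * 150 + [("a", "d")] * 50
Rv = [(0.0, 0.15), (0.9, 1.0)]       # least and most frequent variants
filtered = variants_filter(log, Rv)  # middle-frequency variants are dropped
```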
We do not want to filter out any activities at this point, thus we specify a single range Ra = [0, 1] for the activities filter, meaning we want to keep 100% of the activities. This gives us a filtered log L that we can use further, either in PM4Py or in any other tool. Figure 3 shows a Petri net resulting from applying the Heuristics miner in ProM on the filtered log, adapted for better readability.
However, we may also want to filter activities at this point. Note that, as we have already applied the first filter to our log, only the activities present in the selected variants are available to pick from. Let us say we want to see the least frequent activities as well as those of medium frequency, but not the most frequent ones. In order to do that, we can set multiple ranges for
4 https://github.com/MaxVidgof/cherry-picker. 5 http://bimp.cs.ut.ee.
Fig. 3. Model from the artificial log in Table 1 with variants ranges Rv =
{[0, 0.15], [0.9, 1]} produced by heuristics miner and transformed into a Petri net.
the activities filter: Ra = {[0, 0.1], [0.1, 0.3], [0.4, 0.6]}. Note that range boundaries are allowed to coincide, but an overlap between ranges is not allowed.
Figure 4 shows the resulting model, again adapted to improve readability. As we can see, it only includes the activities that are in the specified ranges: the 40% least frequent activities and some activities of medium frequency. However, the new model does not contain the most frequent activities, as they are outside of the specified ranges. This allows the user to concentrate on the less frequent and presumably more interesting activities.
Fig. 4. Model from the artificial log in Table 1 with variants ranges Rv =
{[0, 0.15], [0.9, 1]} and activities ranges Ra = {[0, 0.1], [0.1, 0.3], [0.4, 0.6]} produced by
heuristics miner and transformed into a Petri net.
Next, we applied our technique on a real-life event log of sepsis cases [8]. This
is a publicly available log containing more than 1000 traces and 15000 events,
each trace corresponding to a pathway through the hospital.
By exploring the log, we can find out that there are 846 different variants, the most frequent of which includes only 35 cases, corresponding to slightly more than 3% of all traces in the log. There are also 784 variants having only a single conforming trace in the log. This means that the term frequent variant is not applicable to this log. Thus, it makes little sense to apply the variants filter on the log, so we can set the range of the first filter to [0,1].
What is really of interest to us is the activities filter. While the filters of traditional process mining tools only allow keeping the most frequent activities, which we will discuss in more detail in Sect. 5, our filter gives us more opportunities. For instance, we can decide to take a deeper look only into the least frequent activities. For this, we would set the activities filter to a range of [0, 0.25]. But we can also add additional ranges to this filter. Let us say that, apart from the least frequent activities, we are also interested in the one activity lying at the 65th percentile of frequency. This is also possible: we just set a second range to [0.65, 0.65].
Fig. 5. Model from the real-life log with activities ranges [0,0.25] and [0.65,0.65] pro-
duced by heuristics miner and transformed into a Petri net
Now, if we apply the Heuristics miner on the filtered log and convert the result to a Petri net, we get the model in Fig. 5. Again, we only see the activities that are in the specified ranges of frequency, a picture that cannot be achieved with any other process mining tool.
5 Discussion
Process mining allows the users to turn event logs into process models. However,
real-life behaviour captured in these event logs of the process may be complex
and exhibit notable variability. This leads to so-called spaghetti models (Fig. 6a)
that are difficult to comprehend. Filtering reduces the complexity of such models
by limiting the number of traces used to produce the model or the number of
activities shown in the resulting model.
However, users have few options to decide what information stays in the model and what can be left out, since existing process mining tools treat frequency as the ultimate measure of importance of a variant or an activity. Due to this, they only offer the user the possibility to keep the most frequent activities or paths. We claim, however, that a process can contain activities that are very important despite being infrequent, but the tools provide virtually no possibility to include them and reduce complexity at the same time. Some of the tools provide the option to focus on any single path, possibly an infrequent one, but then the big picture is lost and the process analyst has to manually incorporate this path into the model in case it is important. Moreover, no tool offers an option to focus on infrequent activities.
Our novel technique increases the utility of filtering event logs for process analysts by allowing them to set multiple frequency ranges, both for filtering variants and activities. Let us provide an illustrative example. Figure 6b shows a model produced by the PM4Py heuristics miner from the real-life log about sepsis cases that we used in the previous section. Here, we used single-range filtering with Rv = [0.65, 1] for the variants filter and Ra = [0.6, 1] for the activities filter. Figure 6c is generated from a log where multi-range filtering was applied. In fact, only a slight modification was made to the activities filter: Ra = {[0.3, 0.3], [0.6, 1]}. This modification leads to the new activity Return ER (the patient returning to the hospital) appearing in the model. This activity, judging from its name, may be extremely important for the domain expert, although it does not happen frequently.
As this example shows, our filtering technique fills a gap that other techniques cannot fill. It does so by allowing the user to set multiple frequency ranges for both variants and activities, which in turn makes it possible to focus on previously disregarded behaviour and gain insights about the process that no other tool can provide. This can be beneficial in scenarios like monitoring safety-critical processes or controlling for possible fraudulent behaviour in companies. In such cases, it is of utmost importance that a filtering technique does not leave out information about potentially harmful cases.
6 Conclusion
References
1. van der Aalst, W.M.P.: Process Mining - Data Science in Action, 2nd edn. Springer,
Heidelberg (2016). https://doi.org/10.1007/978-3-662-49851-4
2. Augusto, A., et al.: Automated discovery of process models from event logs: review
and benchmark. IEEE Trans. Knowl. Data Eng. 31(4), 686–705 (2019)
3. Berti, A., van Zelst, S.J., van der Aalst, W.M.P.: Process mining for python
(PM4PY): bridging the gap between process - and data science. CoRR
abs/1905.06169 (2019)
4. Conforti, R., La Rosa, M., ter Hofstede, A.H.M.: Filtering out infrequent behavior
from business process event logs. IEEE Trans. Knowl. Data Eng. 29(2), 300–314
(2017)
5. Dumas, M., La Rosa, M., Mendling, J., Reijers, H.A.: Fundamentals of Business
Process Management, 2nd edn. Springer, Heidelberg (2018). https://doi.org/10.
1007/978-3-642-33143-5
6. Günther, C.W., van der Aalst, W.M.P.: Fuzzy mining – adaptive process simplifi-
cation based on multi-perspective metrics. In: Alonso, G., Dadam, P., Rosemann,
M. (eds.) BPM 2007. LNCS, vol. 4714, pp. 328–343. Springer, Heidelberg (2007).
https://doi.org/10.1007/978-3-540-75183-0 24
7. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured
process models from event logs - a constructive approach. In: Colom, J.-M., Desel,
J. (eds.) PETRI NETS 2013. LNCS, vol. 7927, pp. 311–329. Springer, Heidelberg
(2013). https://doi.org/10.1007/978-3-642-38697-8 17
8. Mannhardt, F.: Eindhoven University of Technology. Dataset. Sepsis Cases - Event
Log (2016). https://doi.org/10.4121/uuid:915d2bfb-7e84-49ad-a286-dc35f063a460
9. Song, M., Günther, C.W., van der Aalst, W.M.P.: Trace clustering in process
mining. In: Ardagna, D., Mecella, M., Yang, J. (eds.) BPM 2008. LNBIP, vol.
17, pp. 109–120. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-
00328-8 11
10. Taleb, N.N.: The Black Swan: The Impact of the Highly Improbable, vol. 2. Random House, New York (2007)
11. Veiga, G.M., Ferreira, D.R.: Understanding spaghetti models with sequence clus-
tering for ProM. In: Rinderle-Ma, S., Sadiq, S., Leymann, F. (eds.) BPM 2009.
LNBIP, vol. 43, pp. 92–103. Springer, Heidelberg (2010). https://doi.org/10.1007/
978-3-642-12186-9 10
12. Weerdt, J.D., vanden Broucke, S.K.L.M., Vanthienen, J., Baesens, B.: Active trace
clustering for improved process discovery. IEEE Trans. Knowl. Data Eng. 25(12),
2708–2720 (2013)
13. Weijters, A.J.M.M., Ribeiro, J.T.S.: Flexible Heuristics Miner (FHM). In: CIDM,
pp. 310–317. IEEE (2011)
Truncated Trace Classifier. Removal
of Incomplete Traces from Event Logs
1 Introduction
2 Preliminaries
In this section, we briefly introduce the process mining discipline. Then, we
define truncated traces.
Process mining brings data science and business process management closer
together [6]. As stated in the process mining manifesto, the starting point of
process mining is an event log [12]. An event log contains traces, which are
sequences of events. Event logs often contain additional information such as a
timestamp or the resource. We will use the simple event log definition introduced
A truncated trace is an ongoing trace where the end of the process is missing.
Truncated traces are sometimes referred to as ‘incomplete cases’ [7,8], ‘incom-
plete traces’ [5], or ‘missing heads’ [10]. We favor the term truncated over the
term incomplete as the latter is often used for the concept of ‘event log incom-
pleteness’, referring to the fact that an event log will most likely not contain all
the combinations of behaviors that are possible because there are too many of
them [12]. For instance, when there is a loop in the process model, the number of
unique combinations is infinite. Event logs will thus most likely be incomplete, while they may not contain truncated traces.
There are several reasons explaining the existence of truncated traces. They might exist because of a flawed event log extraction process that cuts the traces at a fixed date, leaving the traces that finish after that date truncated. This issue, named 'the snapshots challenge', has been identified by van der Aalst as one of the five challenges that occur when extracting event logs [6, chapter 5.3]. This type of truncated trace could be avoided by extracting only the traces where no event happens after the extraction date. However, once the data is extracted, we cannot know which traces are truncated. As another example, truncated traces can exist because the events have not happened yet. This is especially relevant when working with streaming data. Finally, truncated traces can result from a wrong execution (e.g., the ticket was supposed to be closed but the agent forgot to do it) or when the information system fails. In the next section, we introduce a classifier to automatically detect truncated traces.
A TTC inputs the current execution of a trace and predicts whether it is truncated. As shown in Fig. 1, we generate one input sample and one target for each prefix length of each trace. The input sample represents the current state of the process, on which we apply a TTC. The target is a binary label that is 'true' when the trace is truncated and 'false' otherwise.
This setting implies that 'real' truncated traces, which we would like to identify as such using the TTC, would be labeled as complete. However, our intuition is that the model will also learn from similar complete traces, whose truncated parts will be labeled as 'truncated'. For illustration purposes, let us define the following event log: abc³, ab. During the training phase, the sequence ab appears three times as 'truncated' and once as 'complete'. Hence, during the prediction phase, the sequence ab would most likely be predicted as 'truncated'.
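The prefix labeling can be sketched in a few lines of Python (our illustration; a log is a list of activity sequences):

```python
def prefix_samples(log):
    samples = []
    for trace in log:
        for k in range(1, len(trace) + 1):
            truncated = k < len(trace)      # proper prefixes are 'truncated'
            samples.append((trace[:k], truncated))
    return samples

# Event log abc^3, ab: the prefix ab occurs three times labeled truncated
# and once labeled complete.
log = [("a", "b", "c")] * 3 + [("a", "b")]
labels = [lab for pre, lab in prefix_samples(log) if pre == ("a", "b")]
print(labels)  # [True, True, True, False]
```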
Fig. 2. Illustration of the feature spaces for the input samples aca and acad.
Last Event. Relying only on the last event to predict that a trace is truncated
is one option often mentioned in the literature [5–8]. For example, the input for
sample #3 from Fig. 1 would be ‘c’.
Frequency. The ‘frequency’ feature space counts the occurrences of each event.
As shown in Fig. 2, aca becomes {a:2,b:0,c:1,d:0}. This feature space does not
record the order in which the events appear.
Sequence Tensor. A sequence tensor contains an extra ‘timestep’ dimension.
Each timestep is a matrix similar to the ‘last event’; i.e., it describes which event
happens. The extra dimension allows to describe the full sequence of timesteps
in a lossless way. The number of timesteps is equal to the longest sequence in
the event logs.
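The three feature spaces can be sketched as follows (our illustration, assuming the four-event alphabet a, b, c, d of Fig. 2):

```python
import numpy as np

ALPHABET = ["a", "b", "c", "d"]

def last_event(prefix):                  # one-hot vector of the last event
    return [int(e == prefix[-1]) for e in ALPHABET]

def frequency(prefix):                   # per-event counts; ordering is lost
    return [prefix.count(e) for e in ALPHABET]

def sequence_tensor(prefix, timesteps):  # one one-hot row per timestep (lossless)
    t = np.zeros((timesteps, len(ALPHABET)))
    for i, e in enumerate(prefix):
        t[i, ALPHABET.index(e)] = 1
    return t

print(frequency("aca"))  # [2, 0, 1, 0], i.e. {a:2, b:0, c:1, d:0}
```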
Once the input samples have been mapped into a feature space, they can be fed, together with the target, to a classifier. We propose five TTC implementations, depicted in Fig. 3. As can be seen in Fig. 3, we also add a few base features: (1) the number of activities in the prefix, (2) the number of seconds since the first event in the trace, and (3) the number of seconds since the previous event in the trace. Such extra features were also added in the predictive business process monitoring approach proposed by Tax et al. [3]. The five TTCs are described below.
1 LA (Last Activity). This TTC relies on the last activity to predict that
the trace is truncated.
2 FB (Frequency Based). This TTC uses the ‘frequency’ feature space
described in Sect. 3.
3 FB&LA. This TTC concatenates the TTCs ‘1 LA’ and ‘2 FB’ because
they both convey complementary information.
4 Soft (Softmax). This TTC corresponds to a next-event prediction algorithm. In fact, the implementation is similar to the predictive business process monitoring of Tax et al. [3]. Predicting which event will occur next is a multi-class prediction problem. Thus, we rely on the Softmax function because it transforms the output into a probability distribution. The end of the process is treated as any other event. If the latter is predicted as the most likely next event, we predict that the trace is complete. If not, we predict that the trace is truncated.
5 Sig (Sigmoid). The TTC '5 Sig' turns the multi-class problem into a binary one by using a one-vs-all strategy with the special 'end' event. We implemented both TTCs to compare the accuracy when the neural network is specifically trained to recognize truncated traces ('5 Sig') with when its task is to predict the next event ('4 Soft').
TTCs 1 to 3 use XGBoost1 [20], which stands for eXtreme Gradient Boosting. It relies on an ensemble of decision trees to predict the target. This technique is widely used among the winning solutions in machine learning challenges [20]. For the main settings, we set the number of trees to 200 and the maximum depth of the trees to 8. The last two TTCs rely on a neural network implemented in Keras [21]. As shown in Fig. 3, the architecture has two inputs. First, the sequence tensor is passed to a Long Short-Term Memory (LSTM) network. LSTM is a special type of Recurrent Neural Network (RNN) introduced in [22]. Compared to an RNN, an LSTM possesses a more advanced memory cell that gives it powerful modeling capabilities for long-term dependencies [3]. The output of the LSTM network and the base features are provided to a fully connected layer. Both the LSTM network and the fully connected layer have 16 cells. We use Adam [23] as the optimizer and set the number of epochs to 100.
1 Available at https://github.com/dmlc/xgboost/tree/master/python-package.
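A minimal Keras sketch of the '5 Sig' architecture follows; the input dimensions, the hidden activation, and the loss are our assumptions, as the paper only fixes the 16-cell LSTM and dense layers, the Adam optimizer, and 100 epochs.

```python
from tensorflow import keras
from tensorflow.keras import layers

TIMESTEPS, N_EVENTS = 50, 20   # assumed, log-dependent dimensions

seq_in = keras.Input(shape=(TIMESTEPS, N_EVENTS))   # sequence tensor
base_in = keras.Input(shape=(3,))                   # the three base features
x = layers.LSTM(16)(seq_in)                         # 16-cell LSTM
x = layers.concatenate([x, base_in])
x = layers.Dense(16, activation="relu")(x)          # 16-cell fully connected layer
out = layers.Dense(1, activation="sigmoid")(x)      # one-vs-all on the 'end' event

model = keras.Model([seq_in, base_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy")
# For TTCs 1-3, the XGBoost counterpart would be e.g.:
#   xgboost.XGBClassifier(n_estimators=200, max_depth=8)
```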
4 Benchmark
In this section, we benchmark the five TTCs described in the previous section,
in addition to a baseline approach.
4.1 Datasets
We used 13 event logs2 well known in the process mining literature. The event
logs come from “real-life” systems, offering the advantage of containing complex
traces and a wide range of characteristics visible in Table 1.
To the best of our knowledge, these event logs do not contain truncated
traces. However, this is difficult to confirm. For instance, exceptional events
might happen several months after the event log extraction date. In general,
without having a deep expertise of the domain under analysis and direct access
to the person in charge of the dataset extraction, it is not possible to guarantee
that all traces are complete. We use the term ‘false complete’ to refer to traces
that we wrongly consider complete during the training phase but that are in
fact truncated because more events will happen. We claim that a TTC should
be resilient to ‘false complete’. In other words, a TTC should not overfit on a
single ‘false complete’ and wrongly classify all similar traces as complete.
To test the resilience of the TTCs, we generated 0%, 10%, and 20% of ‘false
complete’ traces by randomly cutting them. The setting with 0% of ‘false com-
plete’ reflects how the TTC should be used with a real dataset, i.e., considering
all the traces as complete. For the two other settings, we kept track of the traces
that are truncated and refer to them as ‘ground truth’. To benchmark the var-
ious TTCs, we use the ground truth. For instance, let us define that abc is a complete trace that we randomly cut to obtain the following 'false complete': ab. During the training phase, we train the classifier to consider ab a complete trace, while during the evaluation ab should be classified as truncated in order to count as correctly classified.
The baseline relies on a set S of ending activities that we will use to filter the truncated traces. As a baseline, we use the method implemented in PM4Py, which works as follows. First, the number of occurrences of each activity as a last activity is counted. Let Ci be the count of the ith most frequent end activity. We start by adding the most frequent end activity (with count C1) to S. Then, we calculate the decreasing factor of the next most frequent activity using the formula Ci/Ci−1. If the decreasing factor is above a defined threshold, we add the corresponding activity to S and move to the next most frequent activity. If the threshold is not met, we stop. We tried the following thresholds: 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, and 0.70. We report the results obtained using a threshold of 0.60, as it is the one that yields the best accuracy in detecting truncated traces. Interestingly, it is also the default value in PM4Py. A sketch of this procedure is given below.
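Our reading of this baseline, sketched in Python (not the actual PM4Py code); it assumes a non-empty log of traces given as tuples of activity labels.

```python
from collections import Counter

def end_activities(log, threshold=0.60):
    counts = Counter(trace[-1] for trace in log).most_common()
    S = {counts[0][0]}                      # always keep the most frequent end activity
    for (act, c), (_, prev_c) in zip(counts[1:], counts):
        if c / prev_c < threshold:          # decreasing factor C_i / C_(i-1)
            break
        S.add(act)
    return S

def is_truncated(trace, S):
    return trace[-1] not in S               # traces not ending in S are flagged
```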
4.3 Evaluation
The first 80% of the traces were used to train the model, and the evaluation was
done on the remaining 20%. Out of the 80% of training data, 20% was used to
validate the parameters. To compare the ground truth with the output of the
TTC, we used the Matthews Correlation Coefficient (MCC) [25]. The MCC has
the nice property of using all the quantities of the confusion matrix, i.e., True
Positive (TP), True Negative (TN), False Positive (FP), and False Negative
(FN). Its value lies between −1 and 1, where 0 represents a random prediction,
1 a perfect score, and −1 an inverse prediction. It is defined as:
MCC = (TP · TN − FP · FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))
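For reference, the same score is available as a standard scikit-learn function (shown here for illustration; the paper does not state which implementation was used):

```python
from sklearn.metrics import matthews_corrcoef

y_true = [1, 1, 0, 0, 1]   # ground truth (1 = truncated)
y_pred = [1, 0, 0, 0, 1]   # TTC output
print(matthews_corrcoef(y_true, y_pred))
```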
Figure 4 aggregates the results per TTC, while Fig. 5 contains the detailed
results. In Fig. 5, we can see a large MCC score gap per dataset. This gap high-
lights the various levels of complexity involved in detecting truncated traces.
Also, none of the techniques always outperforms the others. This is in line with
a similar conclusion that was drawn from large predictive business process moni-
toring experiments [26]. Nonetheless, when looking at Fig. 4, we observe that the
TTC ‘3 FB&LA’ has the highest median MCC score. Interestingly, the perfor-
mance of the baseline is comparable to the best implementations for the following
five datasets: BPI 13 CP, BPI 13 i, BPI 18, Env permit, Helpdesk (see Fig. 5).
For the other eight datasets, there is a clear drop in performance between the
baseline and more sophisticated methods. Looking at Table 1, we do not see any
clear dataset characteristics to explain the performance gap. We conclude that
looking at the last activity might work well, but for some datasets it is better to
use a more sophisticated TTC.
We also tested the null hypothesis that the results from different TTCs come
from the same distribution. To do this, we ran a permutation test with 100,000
random permutations and a p-value of 0.05. The results are visible in Fig. 6. As
can be seen, ‘3 FB&LA’ outperforms the baseline approach with strong statistical
significance. We also observe that transforming the multi-class problem into a binary one (using '5 Sig' instead of '4 Soft') does not seem to improve the ability of the TTC to detect truncated traces, as the MCC scores of Fig. 4 are comparable.
Figure 7 compares the execution time per TTC. The baseline takes on the order of milliseconds to run. TTCs that are based on XGBoost take on the order of seconds or minutes to run, while approaches that rely on neural networks take from minutes to hours. In fact, '4 Soft' and '5 Sig' are on average 112 times slower than the other TTCs, which rely on XGBoost.
(Boxplot data: median MCC per technique: Baseline 0.30; 1 LA 0.49; 2 FB 0.45; 3 FB&LA 0.55; 4 Soft 0.43; 5 Sig 0.44.)
Fig. 4. Boxplot showing the MCC scores per technique. Each dot depicts an individual
value. The median is written on top.
Fig. 6. P-values of the permutation tests between the MCC scores per technique.
Fig. 7. Execution time for the five TTCs. The vertical axis uses a logarithmic scale.
The full benchmark implementation, the parameters, the event logs, as well as the results are available online5. The machine used for the experiments has 61 GB of RAM, 4 CPUs, and a GPU that speeds up the neural network training phase.
A process discovery algorithm discovers a process model from an event log [6]. Because the discovered process model is based on event logs, it offers the advantage of being a data-driven approach that shows how the process is really executed. However, discovering a process model from an event log is a challenging task. Typically, process discovery algorithms are sensitive to noise [10–13]. Applying process mining techniques to traces that are supposedly complete but are instead truncated is no exception. The quality of a process model is commonly measured using four competing metrics [12]: (1) The precision measures to what extent behaviors that were not observed can be replayed on the process model (the fewer, the more precise). (2) The fitness measures to what extent the traces from the event logs can be replayed on the model. (3) The generalization ensures that the model does not overfit. Finally, (4) the simplicity measures the complexity involved in reading the process model. When facing truncated traces, a process discovery algorithm will wrongly infer that the process can be stopped in the middle. This will negatively impact the precision of the discovered process model. To solve this issue, researchers advocate removing truncated traces [5,6,8]. As highlighted by Conforti et al., "[t]he presence of noise tends to lower precision as this noise introduces spurious connections between event labels" [11].
We ran an experiment to measure the impact of removing truncated traces
on the quality of the process models using PM4Py [24], a process mining library
in Python. We used the default metrics in PM4Py which are described in the
following papers: precision [27], fitness [6, p. 250], generalization [28], and sim-
plicity [29]. To start, we randomly generated 100 process models with the PM4py
5 https://github.com/gaelbernard/truncated-trace-classifier.
7 Related Work
To the best of our knowledge we are the first to focus on the task of distinguishing
truncated from complete traces. Still, existing works—especially in the area of
predictive process monitoring—are relevant to uncover truncated traces.
Predictive process monitoring anticipates whether a running process instance
will comply with a predicate [4]. For instance, a predicate might be about the
process execution time, the execution of a specific event, or the total amount of
sales. As highlighted by Verenich et al., techniques in this space differ according
to their object of prediction [18]. A TTC is a specific type of predictive process
monitoring task where the predicate is whether we will observe more events.
In [31], Maggi et al. propose a generic predictive process monitoring approach.
Once the predicate is set, the most similar prefixes are selected based on the
edit distance. Finally, a classifier is used to correlate the goal with the data asso-
ciated with the process execution. Insights are then provided to the end-user
to optimize the fulfillment of the goal while the process is being executed. It
was later extended with a clustering step to decrease the prediction time [4].
Tax et al. propose a neural network that leverages LSTM that could serve as
Table 3. Comparing the accuracy of predicting the next event and the execution time,
without and with a TTC.
8 Conclusion
Event logs are often noisy, which sometimes makes the application of process mining difficult in a real setting [13]. Typically, the existence of truncated traces is a known issue; still, there is a research gap in systematically detecting them. In this work, we treat the identification of truncated traces as a predictive process monitoring task and we benchmark several TTCs using 13 complex event logs. We show that building a TTC that consistently achieves high accuracy is challenging. This finding highlights the importance of conducting further research to
build an efficient TTC. For some event logs, using a baseline approach that relies solely on the last activity works well. Still, we show that the TTC '3 FB&LA' outperforms such a baseline approach with strong statistical significance. We also measure the impact on process model quality when a process discovery algorithm is run on event logs that contain truncated traces. We show that only a few truncated traces can greatly decrease the process model quality and that a TTC can alleviate this problem by automatically removing truncated traces. Finally, we highlight the unexplored potential of a TTC to increase the accuracy of predicting the next event. We expect that more benefits of TTCs are yet to be discovered, especially in the predictive business process monitoring area.
In this work, we use the sequence of activities as well as some timing information. Using more information, such as the resource name, the day of the week, or other event attributes, could further improve the accuracy of the TTCs. Higher accuracy could also be achieved by using different classifiers, trying new neural network architectures, or implementing alternative feature spaces. This is an area for future research where our work can serve as a baseline.
References
1. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured
process models from incomplete event logs. In: Ciardo, G., Kindler, E. (eds.)
PETRI NETS 2014. LNCS, vol. 8489, pp. 91–110. Springer, Cham (2014). https://
doi.org/10.1007/978-3-319-07734-5 6
2. Bernard, G., Andritsos, P.: Accurate and transparent path prediction using process
mining. In: Welzer, T., Eder, J., Podgorelec, V., Kamišalić Latifić, A. (eds.) ADBIS
2019. LNCS, vol. 11695, pp. 235–250. Springer, Cham (2019). https://doi.org/10.
1007/978-3-030-28730-6 15
3. Tax, N., Verenich, I., La Rosa, M., Dumas, M.: Predictive business process mon-
itoring with LSTM neural networks. In: Dubois, E., Pohl, K. (eds.) CAiSE 2017.
LNCS, vol. 10253, pp. 477–492. Springer, Cham (2017). https://doi.org/10.1007/
978-3-319-59536-8 30
4. Di Francescomarino, C., Dumas, M., Maggi, F.M., Teinemaa, I.: Clustering-based
predictive process monitoring. IEEE Trans. Serv. Comput. 12, 896–909 (2016)
5. Bezerra, F., Wainer, J., van der Aalst, W.M.P.: Anomaly detection using process
mining. In: Halpin, T., Krogstie, J., Nurcan, S., Proper, E., Schmidt, R., Soffer, P.,
Ukor, R. (eds.) BPMDS/EMMSAD -2009. LNBIP, vol. 29, pp. 149–161. Springer,
Heidelberg (2009). https://doi.org/10.1007/978-3-642-01862-6 13
6. van der Aalst, W.M.P.: Data science in action. In: Process Mining, pp. 3–23. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49851-4 1
7. Fluxicon: Deal with incomplete cases, process mining in practice, disco 2.1 docu-
mentation. http://processminingbook.com/incompletecases.html
8. Verenich, I.: Explainable predictive monitoring of temporal measures of business
processes. Ph.D. thesis, Queensland University of Technology (2018)
9. Carmona, J., de Leoni, M., Depaire, B.: Process discovery contest. In: International
Conference on Process Mining 2019 (2019)
10. Suriadi, S., Andrews, R., ter Hofstede, A.H., Wynn, M.T.: Event log imperfection
patterns for process mining: towards a systematic approach to cleaning event logs.
Inf. Syst. 64, 132–150 (2017)
11. Conforti, R., La Rosa, M., ter Hofstede, A.H.: Filtering out infrequent behavior
from business process event logs. IEEE Trans. Knowl. Data Eng. 29(2), 300–314
(2016)
12. van der Aalst, W., et al.: Process mining manifesto. In: Daniel, F., Barkaoui, K.,
Dustdar, S. (eds.) BPM 2011. LNBIP, vol. 99, pp. 169–194. Springer, Heidelberg
(2012). https://doi.org/10.1007/978-3-642-28108-2 19
13. Bose, R.J.C., Mans, R.S., van der Aalst, W.M.: Wanna improve process mining
results? In: 2013 IEEE Symposium on Computational Intelligence and Data Mining
(CIDM), pp. 127–134. IEEE (2013)
14. Selig, H.: Continuous event log extraction for process mining (2017)
15. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured
process models from event logs containing infrequent behaviour. In: Lohmann, N.,
Song, M., Wohed, P. (eds.) BPM 2013. LNBIP, vol. 171, pp. 66–78. Springer, Cham
(2014). https://doi.org/10.1007/978-3-319-06257-0 6
16. de Medeiros, A.K.A., Weijters, A.J., van der Aalst, W.M.: Genetic process min-
ing: an experimental evaluation. Data Min. Knowl. Disc. 14(2), 245–304 (2007).
https://doi.org/10.1007/s10618-006-0061-7
17. Verenich, I., Dumas, M., La Rosa, M., Maggi, F.M., Di Francescomarino, C.: Com-
plex symbolic sequence clustering and multiple classifiers for predictive process
monitoring. In: Reichert, M., Reijers, H.A. (eds.) BPM 2015. LNBIP, vol. 256, pp.
218–229. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-42887-1 18
18. Verenich, I., Dumas, M., La Rosa, M., Maggi, F.M., Chasovskyi, D., Rozumnyi, A.:
Tell me what’s ahead? Predicting remaining activity sequences of business process
instances (2016)
19. Leontjeva, A., Conforti, R., Di Francescomarino, C., Dumas, M., Maggi, F.M.:
Complex symbolic sequence encodings for predictive monitoring of business pro-
cesses. In: Motahari-Nezhad, H.R., Recker, J., Weidlich, M. (eds.) BPM 2015.
LNCS, vol. 9253, pp. 297–313. Springer, Cham (2015). https://doi.org/10.1007/
978-3-319-23063-4 21
20. Chen, T., Guestrin, C.: Xgboost: a scalable tree boosting system. In: Proceedings
of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, pp. 785–794. ACM (2016)
21. Chollet, F.: Keras (2015). https://github.com/fchollet/keras
22. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8),
1735–1780 (1997)
23. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint
arXiv:1412.6980 (2014)
24. Berti, A., van Zelst, S.J., van der Aalst, W.: Process mining for Python
(PM4Py): bridging the gap between process- and data science. arXiv preprint
arXiv:1905.06169 (2019)
25. Matthews, B.W.: Comparison of the predicted and observed secondary structure of
T4 phage lysozyme. Biochimica et Biophysica Acta (BBA) Protein Struct. 405(2),
442–451 (1975)
26. Di Francescomarino, C., et al.: Genetic algorithms for hyperparameter optimization
in predictive business process monitoring. Inf. Syst. 74, 67–83 (2018)
27. Muñoz-Gama, J., Carmona, J.: A fresh look at precision in process conformance.
In: Hull, R., Mendling, J., Tai, S. (eds.) BPM 2010. LNCS, vol. 6336, pp. 211–226.
Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15618-2_16
28. Buijs, J.C., van Dongen, B.F., van der Aalst, W.M.: Quality dimensions in process
discovery: the importance of fitness, precision, generalization and simplicity. Int.
J. Coop. Inf. Syst. 23(01), 1440001 (2014)
29. Blum, F.: Metrics in process discovery. Technical report, TR/DCC. 1–21 (2015)
30. Jouck, T., Depaire, B.: PTandLogGenerator: a generator for artificial event data
(2016)
31. Maggi, F.M., Di Francescomarino, C., Dumas, M., Ghidini, C.: Predictive monitoring
of business processes. In: Jarke, M., et al. (eds.) CAiSE 2014. LNCS, vol. 8484,
pp. 457–472. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07881-6_31
32. Nguyen, H., Dumas, M., La Rosa, M., Maggi, F.M., Suriadi, S.: Business process
deviance mining: review and evaluation. arXiv preprint arXiv:1608.08252 (2016)
33. Bertoli, P., Di Francescomarino, C., Dragoni, M., Ghidini, C.: Reasoning-based
techniques for dealing with incomplete business process execution traces. In:
Baldoni, M., Baroglio, C., Boella, G., Micalizio, R. (eds.) AI*IA 2013. LNCS (LNAI),
vol. 8249, pp. 469–480. Springer, Cham (2013). https://doi.org/10.1007/978-3-319-
03524-6_40
Secure Multi-party Computation for
Inter-organizational Process Mining
1 Introduction
Contemporary process mining techniques enable users to analyze business pro-
cesses based on event logs extracted from information systems [1]. The out-
puts of process mining techniques can be used, for example, to identify per-
formance bottlenecks, waste, or compliance violations. Existing process mining
techniques require access to the entire event log of a business process. Usu-
ally, this requirement can be fulfilled when the event log is collected from one
data, the uploaded event log is encrypted and exposed neither to the platform
operator nor other involved parties. Nonetheless, the MPC platform enables the
computation over the encrypted data through protocols for result sharing among
the nodes.
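For illustration, the following minimal Python sketch shows the principle of additive
secret sharing that underlies such platforms; the ring size and helper names are our
own illustration and do not reflect Sharemind's actual API:

    import secrets

    MODULUS = 2 ** 64  # illustrative ring size; Sharemind works over fixed-width integers

    def share(value, n_parties=3):
        # Split a value into n additive shares that sum to the value modulo MODULUS.
        shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
        shares.append((value - sum(shares)) % MODULUS)
        return shares

    def reconstruct(shares):
        # Only the sum of all shares reveals the value; any proper subset looks random.
        return sum(shares) % MODULUS

    assert reconstruct(share(42)) == 42

Each event attribute (case identifier, activity code, timestamp) is uploaded in this
share form, so no single computation node ever sees an original value.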
We realize the above architecture to answer analysis queries that are common
in process mining. Specifically, we show how to construct a frequency and time-
annotated Directly-Follows Graph (DFG), which is a starting point for process
discovery algorithms and performance analysis. While keeping the computed
DFG private, we reveal only the output of performance analysis queries, such
as finding the top-k bottlenecks (i.e., activities with the longest cycle times) or the
top-k most frequent hand-offs. We implement our proposed architecture using
the Sharemind platform [7]. In order to tackle scalability issues that would be
imposed by a naive implementation, we employ vectorization of event logs and
propose a divide-and-conquer scheme for parallel processing of sub-logs. We test
the effectiveness of these optimizations in experiments with real-world event logs.
The remainder of the paper is structured as follows. Section 2 lays out related
work and the background for this work. Section 3 introduces our architecture
for privacy-preserving inter-organizational process mining along with the opti-
mizations needed for efficient implementation. An experimental evaluation is
presented in Sect. 4, before Sect. 5 concludes the paper.
three computing parties. In this paper, we build on top of Sharemind, but our
techniques are also applicable to other secret sharing-based MPC systems.
In Sharemind, a party can play different roles: an input party, a computation
party, and/or an output party. In the case where only two parties are involved in
an inter-organizational process, these two parties play the role of input parties
and also that of computing parties. To fulfill the requirements of Sharemind,
they need to enroll a third computing node, which merely performs computations
using secret shares from which it can infer no information.3
The Sharemind framework provides its own programming language, namely
the SecreC language [6], for programming privacy-preserving applications.
SecreC allows us to abstract away certain details of cryptographic protocols.
This section introduces our techniques for process mining based on secure multi-
party computation. Section 3.1 first clarifies our model for inter-organizational
process mining including the required input data and the obtained analysis
results. We then introduce our architecture for realizing the respective analy-
sis using secure multi-party computation in Sect. 3.2. In Sect. 3.3, we elaborate
on vectorization and parallelization to improve the efficiency of our approach.
3 When three or more parties are involved in a process, no external party is required.
DFG of the inter-organizational process. The basic DFG captures the frequencies
with which the executions of two activities have been observed to directly follow
each other in a trace. Moreover, we consider temporal annotations of the directly-
follows dependencies in terms of time between the respective activity executions.
Queries over the frequency and time-annotated DFGs allow us to analyze the
main paths of the process, the rarely executed paths, as well as the activities
that most contribute to delays in a process. Note though that only query answers
are to be revealed whereas the actual DFG shall be kept private.
Formally, the time-annotated DFG is captured by an |A| × |A| matrix, where
A is the set of all possible activities of the process. Each cell contains a tuple
(c, Δ). The counter c represents the frequency with which a directly-follows
dependency has been observed in L, i.e., for the cell (a1, a2) it is the number
of times that two events e1 = (i1, a1, ts1) and e2 = (i2, a2, ts2) follow each
other directly in some trace (i.e., i1 = i2) of L. Also, Δ is the total sum of the
time passed between all occurrences of the respective events, i.e., ts2 − ts1
for the above events.
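For intuition, the following plaintext (non-secure) Python sketch computes this matrix
for the running example of Fig. 3; function and variable names are ours:

    from collections import defaultdict

    def annotated_dfg(events):
        # events: list of (case id, activity, timestamp) triples.
        # Returns {(a1, a2): (c, delta)} with frequency c and total elapsed time delta.
        by_case = defaultdict(list)
        for i, a, ts in events:
            by_case[i].append((ts, a))
        dfg = defaultdict(lambda: [0, 0])
        for trace in by_case.values():
            trace.sort()  # order the events of a case by timestamp
            for (ts1, a1), (ts2, a2) in zip(trace, trace[1:]):
                dfg[(a1, a2)][0] += 1           # frequency c
                dfg[(a1, a2)][1] += ts2 - ts1   # accumulated duration delta
        return {pair: tuple(cell) for pair, cell in dfg.items()}

    log = [("T1", "A", 1), ("T1", "B", 2), ("T1", "C", 3), ("T1", "D", 4),
           ("T2", "A", 3), ("T2", "B", 5), ("T2", "E", 6)]
    print(annotated_dfg(log))  # ('A', 'B') maps to (2, 3): seen twice, 1 + 2 time units

In the inter-organizational setting, exactly this computation has to be carried out
without any party seeing the combined log in the clear.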
In inter-organizational process mining, the above time-annotated DFG can-
not be computed directly, as this would require the parties to share their sub-logs.
[Fig. 2. Architecture overview: Parties A and B import their event logs as CSV files;
the cooperating parties perform secure processing using secret shares; a query engine
reveals only the query results.]
Preprocessing. Each party performs the preparation of its log at its own site. The
parties share the number of unique activities and the maximum number of events
per trace. In Fig. 3a, we show an example with two traces. In the preprocessing
step, all traces are padded to the same length, as illustrated with the blue event
in Fig. 3a. The activities are transformed into a one-hot encoding that is used
for masking at the DFG calculation step, as will be explained later. The logs are
sorted by traces.
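A minimal sketch of this local preparation step, assuming each party holds its sub-log
as a mapping from case identifiers to (activity, timestamp) lists (the helper names
and the activity-to-index mapping are ours):

    def one_hot(index, length):
        v = [0] * length
        v[index] = 1
        return v

    def preprocess(sub_log, n_activities, max_len, activity_code):
        # Pad every trace to max_len with dummy events and one-hot encode activities.
        # A dummy event carries the all-zero vector, so it can never set a DFG cell.
        rows = []
        for case_id, events in sub_log.items():
            encoded = [(case_id, one_hot(activity_code[a], n_activities), ts)
                       for a, ts in events]
            encoded += [(case_id, [0] * n_activities, 0)] * (max_len - len(encoded))
            rows.extend(encoded)
        return rows

    # Party A's sub-log from the running example of Fig. 3
    la = {"T1": [("A", 1), ("D", 4)], "T2": [("A", 3), ("E", 6)]}
    codes = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4}
    print(preprocess(la, n_activities=5, max_len=2, activity_code=codes))

Since every trace then contributes exactly max_len events, the chunk of an event in
the combined, trace-sorted log can later be derived from its index alone.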
Combination. The parties upload their event logs La and Lb to the MPC
platform in a secret-shared manner. That is, the values (i, a, ts) of each event
(encoded as integers) are split into shares, which do not provide any information
on the original values and are stored at different nodes of the platform. This way,
each party can only see the total number of records uploaded by each party, but
not the particular data. Subsequently, the logs are unified, creating a single log
of events L. The combination is performed in a manner that divides the log into
processing chunks. Since the number of events per trace is fixed after padding,
this is possible by dividing each event's index by the number of events per trace
and assigning events of the same trace to the same chunk. In Fig. 3a, the
system processes each trace in its own chunk.
DFG Matrix Calculation. Next, we construct the DFG matrix inside the MPC
platform, keeping it secret. Since the information on the activity of an event is
secret-shared, we cannot simply process the events of traces sequentially as the
matrix cell to update would not be known. Hence, we adopt a one-hot encoding
for activities, so that each possible activity is represented by a binary vector of
length |A|. To mask the actual number of possible activities, the set over which
the vector is defined may further include some padding, i.e., the vector length
can be chosen to be larger than |A|. Now, if we compute the outer product of such
vectors for activities a1 and a2, we get a mask matrix M such that M[a1, a2] = 1,
while all other entries are 0. An example of such masks is given in Fig. 3b. The
first mask represents the directly-follows dependency from activity A to B of
our running example. The second mask encodes the directly-follows dependency
from activity A to C. For all sequential pairs of events in the sorted log, we sum
up these matrices to get the frequency count c of the directly-follows dependency
[Figure content: party A's sub-log La (T1: A@1, D@4; T2: A@3, E@6) and party B's
sub-log Lb (T1: B@2, C@3; T2: B@5) are one-hot encoded, combined into per-trace
chunks L[0,:,:] and L[1,:,:], padded with dummy events, and sorted in parallel.]
Fig. 3. Example of two event logs and their processing steps inside the system
for (a1, a2). Multiplying M by the duration between two events further enables
us to derive the total duration passed, i.e., Δ, of the directly-follows relation.
The duration operation is performed between every two consecutive events of
the same trace. We can perform the duration calculation as an element-wise
vector subtraction, by duplicating the dataset and shifting its events by one,
as in Fig. 3c. Technically, the outer product is a function that is realized
as a protocol over secret-shared data in Sharemind, and its runtime complexity
is linear in |A| [15].
However, the above approach could mix up events of different traces. We
therefore also compute a flag b that is 1, if the trace identifiers of two events
are equivalent, and 0 otherwise, which is illustrated as the “Same Trace Flag”
column in Fig. 3c. Then, we multiply the mask matrix M by b, so that the
values of M are ignored, if b = 0. Again, the functionality for comparison and
multiplication can be traced back to predefined protocols in Sharemind. We show
the DFG matrix with counts of our running example in Fig. 3d.
Algorithm 1 summarizes the computation of the annotated DFG from the
sorted, combined log L, where [[·]] denotes a secret-shared data value.
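The following numpy sketch is a plaintext stand-in for this masked computation; in
Sharemind, the element-wise comparisons, outer products, and multiplications run as
protocols over secret shares instead:

    import numpy as np

    def dfg_from_encoded(case_ids, onehot, ts):
        # case_ids: (n,) integer array; onehot: (n, |A|) 0/1 matrix; ts: (n,)
        # timestamps, all sorted by trace. Returns count matrix C and duration matrix D.
        cur, nxt = onehot[:-1], onehot[1:]           # the log and its shifted duplicate
        b = case_ids[:-1] == case_ids[1:]            # "same trace" flag per event pair
        masks = np.einsum('ni,nj->nij', cur, nxt)    # one outer-product mask per pair
        masks = masks * b[:, None, None]             # drop pairs that cross traces
        deltas = (ts[1:] - ts[:-1])[:, None, None]
        return masks.sum(axis=0), (masks * deltas).sum(axis=0)

    ids = np.array([1, 1, 1, 1, 2, 2, 2])
    X = np.eye(5, dtype=int)[[0, 1, 2, 3, 0, 1, 4]]  # T1: A,B,C,D and T2: A,B,E
    t = np.array([1, 2, 3, 4, 3, 5, 6])
    C, D = dfg_from_encoded(ids, X, t)
    print(C[0, 1], D[0, 1])  # A -> B: frequency 2, total duration 3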
handover events between the two parties. Based on the secret-shared DFG, the
respective activities may be identified through grouping and sorting the events,
similar to the procedure outlined above, which is again based on the predefined
protocols of an MPC platform such as Sharemind.
group of traces, generating an annotated DFG per group, and finally integrating
the different DFGs. Since events of the same trace will never occur in different
chunks, instead of sorting one log of length n, we will need to sort c chunks of
length n/c each. Since the communication complexity of a privacy-preserving
quicksort is O(n · log n) [14], this improves efficiency.
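As a back-of-the-envelope calculation: sorting c chunks of length n/c each incurs a
total communication cost of c · O((n/c) · log(n/c)) = O(n · log(n/c)) instead of
O(n · log n), and since the chunks are independent, they can be sorted in parallel, so
the latency is governed by a single chunk, i.e., roughly O((n/c) · log(n/c)).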
The above approach raises the question of determining the size of the chunks.
Separating each trace reveals the total number of events of that trace provided
by a party, which may be critical from a privacy perspective. On the other hand,
a small chunk size reduces the overhead of sorting. This leads to a trade-off
between runtime performance and privacy considerations.
However, in our current implementation, all chunks must have the same
length, as Sharemind allows parallel sorting only for equal-length vectors. There-
fore, we apply padding to the traces in the log, adding dummy events (for which
an empty vector in the one-hot encoding represents the activity so that the
events are ignored for the DFG calculation) until the number of events of the
longest trace is reached. Such padding may be employed locally, by each party,
and also has the benefit that the length of individual traces is not revealed.
4 Evaluation
We implemented the proposed approach on top of the Sharemind multi-party
computation platform.4 The source code of our implementation is available at
https://github.com/Elkoumy/shareprom. The implementation is written using
the SecreC programming language supported by Sharemind.
Using this implementation, we conducted feasibility and scalability experi-
ments, specifically to address the following research questions:
RQ1: How do the characteristics of the input event logs influence the perfor-
mance of the secure multi-party computation of the DFG?
RQ2: What is the effect of increasing the number of parallel chunks on the
performance of the multi-party computation of the DFG?
4.1 Datasets
The proposed approach is designed to compute the DFG of an inter-
organizational process where the event log is distributed across multiple parties,
and each party is responsible for executing a subset of the activities (i.e. event
types) of the process. We are not aware of publicly available real-life datasets
with this property. We identified a collection of synthetic inter-organizational
business process event logs [8]. However, these logs are too small to allow us
to conduct performance experiments (a few dozen traces per log). On the other
hand, there is a collection of real-life event logs of intra-organizational processes
comprising logs of varying sizes and characteristics5. From this collection, we
selected three logs with different sizes and complexity (cf. Table 1):
4 https://sharemind-sdk.github.io.
5 https://data.4tu.nl/repository/collection:event_logs_real.
BPIC 2013: This event log captures the incident and problem management process
at the IT department of a car production company.
Credit Requirement: This event log comes from a process for background
checking for the purpose of granting credit at a Dutch bank. It has a simple
control-flow structure: all traces follow the same sequence of activities.
Traffic Fines: This event log comes from a process to collect payment of fines
for traffic law violations at a local police office in Italy.
The experiments focus on the time needed to construct the annotated DFG,
since it is the most sophisticated and time-consuming portion of the proposed
analysis pipeline, due to the communication required between the compute
nodes. Once the annotated DFG is available, stored in a secret-shared manner,
the calculation of the actual queries has a lower complexity.
4.3 Results
Runtime Experiment. In Fig. 4a, we illustrate the observed execution time when
varying the number of chunks used in the parallelization. We plot a bar for
each chunk size. Each bar represents the runtime of the parallel sort in blue
and the runtime of the DFG calculation in orange. From Fig. 4a, we conclude
that the runtime decreases with an increasing number of chunks, due to the
parallel sorting of chunks. We also note that the runtime for the DFG calculation
stays constant. In Fig. 4b, we report the number of processed events per second
when varying the number of chunks. We find a consistent improvement for the
throughput across all event logs.
Regarding RQ1, we summarise that the proportion of runtime between sort-
ing and DFG calculation differs based on the event log characteristics. For the
log with the largest number of event types, the DFG calculation makes up the
most substantial proportion of the total runtime. In contrast, the proportion is
significantly lower for the logs with a smaller number of event types. A possible
explanation for this finding is the increasing size of the vectors required to
represent each activity in our bit-vector encoding. Such an increase results in
more computationally heavy calculations. Regarding RQ2, we conclude that the
runtime decreases for event logs with an increasing number of chunks.
Threats to Validity. The evaluation reported above has two limitations. First, the
event logs used in the evaluation, while coming from real-life systems, are intra-
organizational event logs, which we have split into separate logs to simulate an
inter-organizational setting. It is possible that these logs do not capture the com-
munication patterns found in inter-organizational processes. Second, the number
of event logs is small, which limits the generalizability of the conclusions. The
results suggest that the proposed technique can handle small-to-medium-sized
logs, with relatively short traces, but there may be other characteristics of event
logs that affect the performance of the proposed approach.
[Fig. 4. Evaluation results: runtime and throughput (events per second) for up to
70,000 chunks per dataset (Credit Requirement, Traffic Fines, BPIC 2013), and
per-server measurements (server1, server2, server3), plotted on logarithmic scales
over the number of chunks.]
5 Conclusion
This paper introduced a framework for inter-organizational process mining based
on secure multi-party computation. The framework enables two or more parties
to perform basic process mining operations over the partial logs of an inter-
organizational process held by each party, without any information being shared
besides: (i) the output of the queries that the parties opt to disclose; and (ii)
three high-level log statistics: the number of traces per log, the number of event
types, and the maximum trace length. The paper specifically focuses on the com-
putation of the DFG, annotated with frequency and temporal information. This
is a basic structure used by process mining tools to perform various operations,
such as process discovery and performance analysis.
Acknowledgments. This research is partly funded by ERDF via the Estonian Centre
of Excellence in ICT (EXCITE) and the IT Academy programme.
References
1. van der Aalst, W.M.P.: Process Mining: Data Science in Action, 2nd edn.
Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49851-4
2. Aksu, Ü., Schunselaar, D.M., Reijers, H.A.: A cross-organizational process min-
ing framework for obtaining insights from software products: accurate comparison
challenges. In: 2016 IEEE 18th Conference on Business Informatics (CBI), vol. 1,
pp. 153–162. IEEE (2016)
3. Araki, T., Furukawa, J., Lindell, Y., Nof, A., Ohara, K.: High-throughput semi-
honest secure three-party computation with an honest majority. In: Proceedings of
the 2016 ACM SIGSAC Conference on Computer and Communications Security,
Vienna, Austria, 24–28 October 2016, pp. 805–817 (2016)
4. Archer, D.W., et al.: From keys to databases—real-world applications of secure
multi-party computation. Comput. J. 61(12), 1749–1771 (2018)
5. Bauer, M., Fahrenkrog-Petersen, S.A., Koschmider, A., Mannhardt, F., van der
Aa, H., Weidlich, M.: ELPaaS: event log privacy as a service. In: Proceedings of the
Dissertation Award, Doctoral Consortium, and Demonstration Track at BPM 2019
co-located with 17th International Conference on Business Process Management,
BPM 2019, Vienna, Austria, 1–6 September 2019, pp. 159–163 (2019)
Visualizing Business Process Evolution
1 Introduction
Business process management (BPM) has long researched business processes through a
prescriptive lens. This perspective emphasizes the static nature of processes, their design,
and their execution. The descriptive perspective, on the other hand, acknowledges the
dynamic nature of processes and their intentional and unintentional adjustment over time
due to changes in the environment, technological capabilities, seasonal differences, and
other factors [1, 2]. Similarly, literature in routine research has increasingly recognized
the dynamic change of routine executions over time [3, 4].
To address this dynamism, recent research aims to identify changes in processes
over time. Most prominently, process drift uses event logs to detect points in time when
changes take place [1, 5]. While process drift offers insights into when changes in the
process happen, it does not provide information on how different executions of the
same process interrelate with each other, in particular, when and how similar process
executions occur and dominate. Thus, the meaningful interpretation of process evolu-
tion over time, i.e., the changes in the process execution over time, remains a research
challenge. Evolution is defined as a recurring variation, selection, and retention among
entities of a designated population [6]. We build on this concept by declaring all traces
generated during the execution of a process as the designated population, while the
variation, selection, and retention (i.e. the appearance, disappearance, and change) hap-
pens among sub-groups of these traces. This is due to scarce environmental resources
[6] which correspond to the organizational resources available for the execution of the
process.
In order to advance our understanding of business process evolution, this research
aims to answer the following research question: How can temporal dynamics of differ-
ent process executions be analyzed? To approach this question, we propose a process
evolution analysis (PEA) method that uses process execution clustering and time-series
line plots to study the evolution of the process execution clusters over time. PEA thereby
aims to provide insights into how clusters of process executions are interrelated over time
in a process evolution graph.
This paper proceeds as follows. Section two discusses the main concepts. Section
three describes our research method. Section four presents the results of applying PEA.
Section five concludes with a discussion of the results, limitations of the approach, and
future research directions.
2 Background
In this section, we describe the main concepts used in this paper. First, we describe
process drift and emphasize how our approach differs from it. Next, we outline trace
clustering, a process mining method used to reduce the complexity of process mining
problems by grouping process instances based on their similarity, e.g., trace fitness,
execution time, etc. Lastly, we summarize the state of the art of different
visualization techniques that influenced our technique.
Process drift refers to a change in process executions and is known as a key challenge
in process mining [7]. Process drift is extensively studied in the process mining literature
[1, 5, 8, 9] with a focus on drift points, drift types [1, 8], and drift visualizations [5, 9].
Seeliger et al. [10] define process drift as a significant behavioral change during the
execution of a process that has occurred over time. A drift in a process, according to
their definition, occurs when almost all traces are influenced by that drift [10]. Similarly,
Maaradji et al. [1] focus on the detection of process drifts by identifying statistically
significant differences between the observed process behavior of different intervals in
time. Yeshchenko et al. [5] focus on the detection of separate behavioral drifts within a
process that are simultaneously present in the event log.
Most existing process drift research interprets process change as a change of the
underlying control flow (implicit process model change). However, changes in the pro-
cess can also be viewed through various process performance measures that are important
for the analysis of the process executions [11]. These measures show how the process
characteristics (such as cycle time, cost per instance, etc.) change over time. Existing
process drift approaches thereby focus on the discovery of behavioral change within
the event log, ignoring the impact on process measures. This creates a process overview
in which the analyst might be incapable of inferring which change in the process is important
enough to manage. Our method allows for tracking process KPIs in correspondence
with such changes.
between the cases within the same cluster [12, 13]. Another technique to cluster the log
is using the cycle time to distribute the cases over the clusters [15]. Thus, all the cases
that have a long cycle time are in the same cluster, while cases with a short cycle time
are in another cluster.
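As a minimal sketch of such cycle-time-based clustering, assuming a pandas event log
with case_id and timestamp columns (the column names are ours; the quartile scheme
mirrors the four clusters used in Sect. 4, with cluster 1 being the slowest quartile):

    import pandas as pd

    def cycle_time_clusters(log, n_clusters=4):
        # Cycle time of a case = time between its first and its last event.
        cycle_time = log.groupby('case_id')['timestamp'].agg(lambda s: s.max() - s.min())
        ranks = cycle_time.rank(method='first', ascending=False)  # rank 1 = slowest case
        # Quantile-based binning: equal-sized groups from slowest (1) to fastest (n).
        return pd.qcut(ranks, n_clusters, labels=range(1, n_clusters + 1))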
Process evolution graphs – We use the time-series line plot [18, 20, 21] and theme
river [19, 22] visualization in order to analyze the change in the behavior between the
different clusters. The graph shows the change of a process metric over time for each
trace cluster. Different process execution characteristics can be measured; for instance,
the number of executed cases per day, the average cycle time of cases per month, or the
average resource workload per day.
By combining trace clustering with the computation of process metrics and visual-
izing the result on a time-series plot, behavioral changes in the process execution can be
made visible.
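A sketch of such a process evolution plot, under the same assumptions as the previous
sketch (a pandas event log with a datetime timestamp column), using the number of cases
with at least one event per month as a simple proxy for the active cases:

    import matplotlib.pyplot as plt

    def plot_evolution(log, clusters):
        # clusters: Series mapping case_id to a cluster label, e.g. from the sketch above.
        df = log.assign(cluster=log['case_id'].map(clusters),
                        month=log['timestamp'].dt.to_period('M'))
        active = (df.groupby(['month', 'cluster'])['case_id']
                    .nunique().unstack(fill_value=0))
        active.plot()                    # use active.plot.area() for a theme-river view
        plt.xlabel('Month')
        plt.ylabel('Active cases')
        plt.show()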
4 Findings
Experiment setup – We conduct two experiments on two datasets and visualize process
evolution using different trace clustering techniques. The first experiment uses the log
from the help desk of an Italian software company1. The traces are clustered based on
the cycle time per trace with a Python script made available on GitHub2. The clusters
are used to visualize process evolution using a theme-river chart. Each generated cluster
represents a different cycle time behavior. The second experiment uses the traffic logs
from BPIC3. We applied a trace clustering based on trace fitness using the ActiTraC ProM
plugin [12] with the following settings: a target ICS fitness of 0.9, a maximum of 6
clusters, and remaining traces added to the other clusters. Each generated cluster
represents a different process behavior. Moreover, we visualize process evolution using
a time-series line chart.
For the first experiment, Fig. 2 shows the number of cases for each cluster over 50
months. The x-axis depicts the months within this timespan, while the y-axis represents
the number of active cases. As has been described before, the traces were clustered
according to the cycle time of each case. The 25% of cases with the slowest cycle time
1 https://data.4tu.nl/repository/uuid:0c60edf1-6f83-4e75-9367-4c63b3e9d5bb.
2 https://github.com/yesanton/Visualizing-Business-Process-Evolution.
3 https://data.4tu.nl/repository/uuid:270fd440-1057-4fb9-89a9-b699b47990f5.
Fig. 2. Number of cases for each cluster over time for help desk event log
are in cluster 1. Clusters 2 and 3 follow cluster 1 in steps of 25%. Cluster 4 contains
the 25% of cases that were the fastest.
The graph can be analyzed in several ways. First, the cluster containing the most
active cases at a certain point in time can be identified. For instance, cluster 1 (25%
slowest cases in the process) contains the most active cases in the first five months. In
the sixth month, however, cluster 2 is dominating the other clusters, until most cases can
be found in cluster 1 from month 15 to 26. Second, the discrepancy in case distributions
between the different clusters can be observed. One example is the timespan from the
7th to 20th month, in which there is a substantial discrepancy of active cases for the
clusters (i.e., the number of cases in each cluster differ considerably) compared to the
timespan between the 24th and 28th month (i.e., the clusters contain a similar amount of
cases). Third, it can be seen whether the number of cases for the clusters correlates over
time. For instance, all clusters in the timespan between the 33rd and 36th month decrease
similarly in number. However, between the 27th and 32nd month, cluster 3 and cluster
4 negatively correlate. Fourth, quick changes in a cluster can be spotted. For instance,
cluster 2 is more than doubled its active cases within a quarter of a year from month 7
onwards.
For the second experiment, Fig. 3 shows the number of cases for each cluster over
56 months. Differently to the previous experiment, the clusters were not generated by
grouping traces with similar cycle time but based on their trace fitness (ICS-fitness).
Six clusters were generated. It can be seen that cluster 1 dominated all other clusters
in terms of active cases until month 41. From this point in time, the cluster does not
contain enough active cases to be visible on the graphical representation. Until the 41st
month, cluster 1 also seems to correlate with cluster 2, while the magnitude of changes
for cluster 1 is much bigger. Sudden changes are also visible. In month 28, cluster
1 increases substantially in the number of active cases within one month, by a factor
of 3 to 4. Clusters 3, 4, 5, and 6 have fewer than 250 cases for most of the time; thus,
they do not play a significant role during the execution of the process.
Cluster 3 dominates clusters 4, 5, and 6 in the periods between the 1st to 7th and
the 53rd to 55th months. Aside from these periods, no clear dominance can be found
within these four clusters.
Fig. 3. Number of cases for each cluster over time for traffic fines event log
In this paper, we described the process evolution analysis (PEA) method, which visual-
izes different clusters of process executions and their interrelation over time. PEA builds
on the concepts of trace clustering, process drift, and process visualization.
PEA has several practical and theoretical implications. From a practical viewpoint,
PEA can be used as a technique to analyze process executions and to identify undesired
behaviors that emerge. For instance, if the clusters are generated by categorizing cases
based on their cycling time, a rise of cases with a long cycling time can be identified, and
take the required actions. Thereby, PEA can be used for real-time analysis of process
execution to intervene directly, or as a retrospect, analysis to identify recurring patterns of
undesired process behaviors. The same applies to clusters, which were generated by their
trace fitness. The underlying process models from these clusters can be generated through
process discovery (e.g., by the use of a process mining tool) and thereby unwanted
process executions discovered. The evolution of these clusters can again be made explicit
through PEA.
From a theoretical point of view, this is a first step towards synthesizing research on trace
clustering, process drift, and process visualization into one technique. By doing so, we
enabled the identification and analysis of changes within the different clusters of traces
over time. While process drift focuses mainly on points in time in which the general
process behavior changes [10], PEA shows how these underlying behavioral changes
interrelate over time. It also extends existing research on process clustering, which has
mainly focused on discovering the underlying process models of these clusters for further
analysis.
We recommend conducting future research in the following directions. First, alterna-
tive forms of visualization could be investigated. One such promising visualization is the
stacked graph, which is designed to facilitate the identification of patterns, trends, and
unexpected occurrences [23]. Future research should investigate which forms of visu-
alization for PEA are useful, taking into consideration different application scenarios.
Second, different information besides the number of active cases within a cluster could
be plotted on the graph. For instance, the y-axis could depict the average cycle time for
active cases, which start at a specific time. Third, other forms of clustering the traces
could be explored. In this paper, we clustered the traces according to their trace fitness
and the cycle time of the cases. One alternative could be to cluster the traces according
to the costs or revenue of the cases to investigate the evolution of these clusters.
References
1. Maaradji, A., Dumas, M., La Rosa, M., Ostovar, A.: Fast and Accurate Business Process Drift
Detection. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23063-4
2. Pentland, B., Recker, J., Kim, I.: Capturing reality in flight? Empirical tools for strong process
theory. In: Thirty Eighth International Conference on Information Systems, pp. 1–12, Seoul
(2017)
3. Feldman, M.S., Pentland, B.T.: Reconceptualizing organizational routines as a source of
flexibility and change. Adm. Sci. Q. 48, 94 (2003). https://doi.org/10.2307/3556620
4. Pentland, B.T., Feldman, M.S., Becker, M.C., Liu, P.: Dynamics of organizational routines:
a generative model. J. Manag. Stud. 49, 1484–1508 (2012). https://doi.org/10.1111/j.1467-
6486.2012.01064.x
5. Yeshchenko, A., Di Ciccio, C., Mendling, J., Polyvyanyy, A.: Comprehensive process drift
detection with visual analytics. In: Laender, A.H.F., Pernici, B., Lim, E.-P., de Oliveira, J.P.M.
(eds.) ER 2019. LNCS, vol. 11788, pp. 119–135. Springer, Cham (2019). https://doi.org/10.
1007/978-3-030-33223-5_11
6. van de Ven, A.H., Poole, M.S.: Explaining development and change in organizations. Acad.
Manag. Rev. 20, 510–540 (1995). https://doi.org/10.2307/258786
7. van der Aalst, W., et al.: Process mining manifesto. In: Daniel, F., Barkaoui, K., Dustdar, S.
(eds.) BPM 2011. LNBIP, vol. 99, pp. 169–194. Springer, Heidelberg (2012). https://doi.org/
10.1007/978-3-642-28108-2_19
8. Maaradji, A., Dumas, M., La Rosa, M., Ostovar, A.: Detecting sudden and gradual drifts in
business processes from execution traces. IEEE Trans. Knowl. Data Eng. 29, 2140–2154
(2017). https://doi.org/10.1109/TKDE.2017.2720601
9. Denisov, V., Belkina, E., Fahland, D., van der Aalst, W.M.P.: The performance spectrum
miner: visual analytics for fine-grained performance analysis of processes. In: CEUR
Workshop Proceedings, vol. 2196, pp. 96–100 (2018)
10. Seeliger, A., Nolle, T., Mühlhäuser, M.: Detecting concept drift in processes using graph
metrics on process graphs. In: ACM International Conference Proceeding Series Part F1271
(2017). https://doi.org/10.1145/3040565.3040566
11. Dumas, M., La Rosa, M., Mendling, J., Reijers, H.A.: Fundamentals of Business Process
Management. Springer, Heidelberg (2018). https://doi.org/10.1007/978-3-642-33143-5
12. De Weerdt, J., Vanden Broucke, S., Vanthienen, J., Baesens, B.: Active trace clustering for
improved process discovery. IEEE Trans. Knowl. Data Eng. 25, 2708–2720 (2013). https://
doi.org/10.1109/TKDE.2013.64
13. De Leoni, M., Van Der Aalst, W.M.P., Dees, M.: A general process mining framework for
correlating, predicting and clustering dynamic behavior based on event logs. Inf. Syst. 56,
235–257 (2016). https://doi.org/10.1016/j.is.2015.07.003
14. Song, M., Günther, C.W., van der Aalst, W.M.P.: Trace clustering in process mining. In:
Ardagna, D., Mecella, M., Yang, J. (eds.) BPM 2008. LNBIP, vol. 17, pp. 109–120. Springer,
Heidelberg (2009). https://doi.org/10.1007/978-3-642-00328-8_11
15. Bose, R.P.J.C., van der Aalst, W.M.P.: Trace clustering based on conserved patterns: towards
achieving better process models. In: Rinderle-Ma, S., Sadiq, S., Leymann, F. (eds.) BPM
2009. LNBIP, vol. 43, pp. 170–181. Springer, Heidelberg (2010). https://doi.org/10.1007/
978-3-642-12186-9_16
16. Aigner, W., Miksch, S., Schumann, H., Tominski, C.: Visualization of Time-Oriented Data.
Springer, London (2011). https://doi.org/10.1007/978-0-85729-079-3
17. Liu, S., Wu, Y., Wei, E., Liu, M., Liu, Y.: StoryFlow: tracking the evolution of stories. IEEE
Trans. Vis. Comput. Graph. 19, 2436–2445 (2013). https://doi.org/10.1109/TVCG.2013.196
18. Cui, W., Liu, S., Wu, Z., Wei, H.: How hierarchical topics evolve in large text corpora.
IEEE Trans. Vis. Comput. Graph. 20, 2281–2290 (2014). https://doi.org/10.1109/TVCG.
2014.2346433
19. Sung, C.Y., Huang, X.Y., Shen, Y., Cherng, F.Y., Lin, W.C., Wang, H.C.: Exploring online
learners’ interactive dynamics by visually analyzing their time-anchored comments. Comput.
Graph. Forum. 36, 145–155 (2017). https://doi.org/10.1111/cgf.13280
20. Liu, S., Yin, J., Wang, X., Cui, W., Cao, K., Pei, J.: Online visual analytics of text streams.
IEEE Trans. Vis. Comput. Graph. 22, 2451–2466 (2016). https://doi.org/10.1109/TVCG.
2015.2509990
21. Wu, Y., Liu, S., Yan, K., Liu, M., Wu, F.: OpinionFlow: visual analysis of opinion diffusion
on social media. IEEE Trans. Vis. Comput. Graph. 20, 1763–1772 (2014). https://doi.org/10.
1109/TVCG.2014.2346920
22. Havre, S., Hetzler, B., Nowell, L.: ThemeRiver: visualizing theme changes over time. In:
Proceedings of IEEE Symposium on Information Visualization, pp. 115–123 (2000). https://
doi.org/10.1109/infvis.2000.885098
23. Byron, L., Wattenberg, M.: Stacked graphs – geometry & aesthetics. IEEE Trans. Vis. Comput.
Graph. 14, 1245–1252 (2008). https://doi.org/10.1109/TVCG.2008.166
Mining BPMN Processes on GitHub
for Tool Validation and Development
1 Introduction
Mining software repositories, i.e., the systematic retrieval, processing and anal-
ysis of data about software artifacts and software development from software
forges and repositories, has drawn considerable attention in recent years. On the
one hand, the increasing popularity and usage of platforms like GitLab.com,
Bitbucket.org, SourceForge.net or GitHub.com for collaborative software
development provide a tremendous source of data, encompassing software from
a rich and heterogeneous spectrum of domains. On the other hand, the data
mining techniques available today have made it possible to start to seize this
treasure, and thus allow researchers to answer research questions and empirically
validate hypotheses on the development and usage of IT and software systems based upon
real-world data. While the research field is mainly focused on source code of con-
ventional programming languages, mining software repositories can as well help
to understand more about the use of other artifacts in software development [12].
In particular the area of modeling languages, such as the Unified Modeling
Language (UML) or the Business Process Model and Notation (BPMN) [4], can
benefit from a data-driven approach like mining software repositories. There is a
common lack of larger datasets with real-world models, which hinders empirical
research in this area [20,28,30]. Retrieving systematically a corpus of models
by mining software repositories promises to overcome this lack. For instance,
research questions on how modeling languages are used in practice can be inves-
tigated based on the mined corpus, in order to distinguish the more frequently
used and important parts of a language from unimportant parts and thus guide
language and tool development. Analyzing the different modeling styles in the
corpus allows for identifying best practices and guidelines to help model design-
ers. Furthermore, best practices and tools proposed by academic research or by
industry can be validated in a more realistic manner. Currently, there are often
case studies, including only a small and homogeneous set of models, or artifi-
cial examples used when evaluating new tools and methods, with consequences
for an evaluation’s validity. Instead using a large set of real-world models as
retrieved by mining software repositories allows for increasing validity. In this
respect, the prior work on the creation of the Lindholmen dataset with UML
models mined from software repositories on GitHub.com was an inspiration for
this paper [12,28].
Empirical research is limited by the access to primary sources. Especially in
case of modeling languages used for business processes, like BPMN, where the
model is usually the product and thus subject to strict nondisclosure restric-
tions [30], this can pose an insurmountable challenge. Previous empirical studies
on business process modeling and BPMN were using methods like experiments,
surveys or case studies, each implying limitations to their generalizability [23]. In
an experiment, a certain aspect, e.g., a modeling practice, is typically researched
by differentiating two groups based on the aspect. The need to provide a large
population and to strictly control the experiment’s environment for all other
variables, however, restricts the applicability of this method to narrow research
problems. Surveys allow for more general problems, but are subject to bias, intro-
duced, e.g., by the selection of survey participants or by inaccurate responses
from the participants. Case studies are frequently used and in particular allow
for insights into real-world practices and constraints. Compared to experiments,
there is though no control of environment and influencing variables. Further-
more, reproducibility and comparability is usually not given. Due to the often
homogeneous origin of business process models included in case studies, their
findings also need to be validated by other research to increase generalizabil-
ity [20,23]. Most of the empirical research focused on conceptual process models
and omitted implemented and executable process models [23]. This also applies
to community efforts to assemble collections of business process models like the
BPM Academic Initiative [20], which mostly covers educational process models.
Mining software repositories for business process models, as introduced in the
following, can be seen as another empirical approach, complementing established
methods.
In this paper, we present our efforts to create a corpus of BPMN process
models by mining software repositories hosted on GitHub.com. Due to the sheer
amount of repositories and the bottleneck caused by the GitHub API’s rate
limit, we limited our search to a random subset of 6,163,217 repositories, or
10% of all repositories on GitHub.com in November 2018. As a result, we were
able to identify 1,251 repositories with at least one potential BPMN artifact
end, the BPMN standard not only includes a graphical modeling notation, but
also a machine-processable serialization format for process model interchange
and the natural language definition of an execution semantics. In Fig. 1, a sim-
ple sample BPMN process model, consisting of a single human task, is shown in
its graphical modeling notation as well as in the XML-based interchange format.
Process modeling with BPMN is known to be error-prone, in particular when
it comes to executable business processes [9,10,22]. This is due to deliberate deci-
sions on the language’s design, e.g., unstructured vs structured process modeling
and implied control flow errors, and to the complexity of applications and under-
lying technologies [29]. To illustrate the latter, consider an executable process
package deployed on a process engine like Camunda, which not only includes
the BPMN process model, but also XML schema files, Groovy or JavaScript
code snippets, Java classes, and various configuration scripts. Accordingly, there
exists a large body of work on process analysis tools, both from industry and
academic research, to help process designers to detect modeling errors as early
as possible.
Proposed tools span the whole spectrum of static analysis, i.e., automated
rule-based inspection of process models for finding modeling errors or flaws. Lint-
ing tools, like bpmnlint 4 or BPMNspector [10] can be used to check business
process models for suspicious and non-portable modeling styles, mostly on the
syntactical level. More elaborate tools allow for checking conformance to best
practices, e.g., Signavio 5 , or identifying control and data flow anomalies, e.g.,
deadlocks [31], processing of undefined data [29], and support even more specific
analysis problems like data leak detection [13,16]. Eventually, full-fledged model
checking and verification tools can prove the compliance of a process model to
certain desirable properties, e.g., proper termination known as soundness [9], by
mapping the process model to a formalism like Petri nets [7,14]. Recapitulating
4 https://github.com/bpmn-io/bpmnlint.
5 https://www.signavio.com.
the tool evaluations in the literature, we observe the frequent use of case studies,
where the most thorough evaluation in [9] comprises 735 process models.
Web Scraping and Git. Since GitHub.com implements a web interface for man-
aging and accessing hosted software repositories, conventional web scraping can
also be used for extracting repository data out of the websites’ HTML code.
Eventually, knowing the URL of a software repository allows for cloning the
repository using the standard git tooling. However, as cloning a repository is
time-consuming due to downloading its complete history and all its contents,
this may not scale in the presence of thousands or even millions of repositories.
Our approach of mining software repositories for BPMN 2.0 process models
uses a combination of the methods discussed above, inspired by previous work on
the creation of the Lindholmen dataset with UML models from GitHub.com [12,
28]. The approach consists of four steps, which we conducted in the beginning
of 2019, see also Fig. 2: (1) Get a list of repositories hosted on GitHub.com and
select a proper subset thereof, (2) Find and extract potential BPMN process
model artifacts as well as associated metadata, (3) Examine the artifacts to
identify BPMN 2.0 process models and clean up the resulting data, (4) Analyze
the resulting set of process models for answering our research questions:
11 https://cloud.google.com/bigquery/public-data (table github_repos.contents).
Data Extraction. All 6,163,217 repositories have been examined for BPMN
process model artifacts, using three steps for each repository. Similar to [12],
the default branch of the repository (master in most cases, though not always)
and its latest commit is identified first, using up to three queries to the GitHub
API. Afterwards, the repository structure is accessed for this commit, in order
to generate the list of files for the repository, again querying the GitHub API.
We reused and adapted the Python scripts12 from [12,28] for implementing this
step. Note that this step alone would require more than 100 days for our
repository subset using the credentials of a single user, due to the rate limit imposed
by the GitHub API. In order to increase throughput, we therefore conducted
data extraction using several user credentials, which were donated, in parallel.
However, this step was still the bottleneck of our approach and lasted 31 days.
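A sketch of this extraction step against the documented GitHub REST API (error
handling, pagination of truncated trees, and rate-limit pauses are omitted; the helper
name is ours, not that of the reused scripts):

    import requests

    API = 'https://api.github.com/repos'

    def list_files(owner, repo, token):
        # Three requests per repository, mirroring the step described above:
        # repository metadata (default branch), latest commit of that branch, file tree.
        headers = {'Authorization': f'token {token}'}
        meta = requests.get(f'{API}/{owner}/{repo}', headers=headers).json()
        branch = requests.get(f'{API}/{owner}/{repo}/branches/{meta["default_branch"]}',
                              headers=headers).json()
        tree = requests.get(f'{API}/{owner}/{repo}/git/trees/{branch["commit"]["sha"]}',
                            params={'recursive': 1}, headers=headers).json()
        return [e['path'] for e in tree.get('tree', []) if e['type'] == 'blob']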
Based on the resulting lists, we then scanned for potential BPMN process
model artifacts. Having tried several heuristics, we opted for simply including
all files with the term "bpmn" in their name or file extension. As a result, we
found 1,251 repositories with at least one potential artifact and overall 21,306
potential artifacts. Each of the identified repositories was then cloned locally
with the standard git tooling. Metadata was extracted from the downloaded
repositories using Code Maat 13 . Information about the repository, its metadata
and the identified artifacts were stored in a relational database for later analysis.
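The filename heuristic itself amounts to a one-line filter over the extracted file lists
(case-insensitive matching is our assumption):

    def potential_bpmn_artifacts(paths):
        # Keep files whose name or extension contains the term "bpmn".
        return [p for p in paths if 'bpmn' in p.rsplit('/', 1)[-1].lower()]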
Analysis. In the final step, we analyzed the retrieved metadata and the iden-
tified BPMN process models. The former analysis was mainly implemented by
querying a relational database. For the latter analysis, we processed the models
and ran the tool BPMNspector15 [10] on them. The tool’s reports were fed back
into the database and afterwards analyzed and aggregated using SQL queries.
12 https://github.com/LibreSoftTeam/2016-uml-miner.
13 https://github.com/adamtornhill/code-maat.
14 http://doubles.sourceforge.net.
15 https://github.com/uniba-dsg/BPMNspector.
The corpus of BPMN process models and more information, including the list of
identified repositories and their metadata, is available online [15]16 . The scripts
used to implement the mining process can be obtained from the same source.
When just considering the serialized BPMN 2.0 process models, the identified
artifacts were distributed over 928 software repositories.
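One plausible way to separate serialized BPMN 2.0 models from images and other
artifacts is to check a file's root element against the BPMN 2.0 model namespace;
this sketch is our illustration, not necessarily the authors' exact procedure:

    import xml.etree.ElementTree as ET

    BPMN2_NS = 'http://www.omg.org/spec/BPMN/20100524/MODEL'

    def is_serialized_bpmn2(path):
        # Inspect only the root element: serialized BPMN 2.0 files declare the
        # standard's model namespace there.
        try:
            for _, elem in ET.iterparse(path, events=('start',)):
                return elem.tag.startswith('{' + BPMN2_NS + '}')
        except ET.ParseError:
            return False
        return False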
16 https://github.com/ViktorStefanko/BPMN_Crawler.
While we admittedly used a very simple heuristic for identifying BPMN pro-
cess models and therefore may have missed many BPMN models hosted on
GitHub.com, we nevertheless found a substantial number of artifacts and also of
serialized BPMN 2.0 files. The resulting number of BPMN process models clearly
exceeds the numbers used in case studies (compare with Sect. 2), but is smaller
than the numbers reported for UML models (21,316 in [12] and 93,596 in [28]).
The share of 0.02% of repositories with at least one potential BPMN process
model artifact is also smaller than the share of 2.8% reported for UML in [12].
The results can be well explained by UML being a family of general-purpose
modeling languages, while BPMN is a domain-specific modeling language. Note
that UML is also older than BPMN. The reports on UML also present a larger
fraction of images among the identified UML models (51.7% in [12] and 61.8%
in [28]), which may be explained by their more permissive heuristic of considering
files with terms like “diagram” or “design” in their name as UML models.
Age. We also looked at the age of the identified potential BPMN process model
artifacts, i.e., the time passed since their last modification in a repository. Unsur-
prisingly, most artifacts are recent. More than every third artifact was modified
in the last year at the time of conducting the study in the beginning of 2019:
Age in years           <1        1        2        3       4      >4     n.a.
Number of artifacts    7,656     5,154    3,291    2,344   712    2,079  70
                       (36.0%)   (24.2%)  (15.4%)  (11.0%) (3.3%) (9.8%) (0.3%)
We did, though, sporadically find artifacts older than 8 years, which thus cannot
reference the BPMN 2.0 standard. The results can be well explained by the
exponential growth of the number of software repositories on GitHub.com.
Size. Analyzing process model size was conducted for the 8,904 distinct process
models in the BPMN 2.0 serialization format. The XML-based format includes
the XML node <process>, which defines a process’ logical structure [4]. To get
a simple measure for the size of a model, we thus simply counted the number of
children elements of the <process> node. The corpus of BPMN 2.0 process models
includes a range of different model sizes.
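A sketch of this size measure for a serialized BPMN 2.0 file:

    import xml.etree.ElementTree as ET

    BPMN2_NS = '{http://www.omg.org/spec/BPMN/20100524/MODEL}'

    def model_sizes(path):
        # Number of child elements of each <process> node, the size measure used above.
        root = ET.parse(path).getroot()
        return [len(proc) for proc in root.iter(BPMN2_NS + 'process')]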
5 Related Work
Most related to our work are the creation of and research on the Lindholmen
dataset19, which was also the inspiration for our approach. The creators of
the dataset describe the used mining software repositories approach [12], intro-
duce the dataset [28], and report on insights gained about the use of UML on
GitHub.com by analyzing the dataset, e.g., in [5,18]. Their main research question,
though, was on the usage of UML in conventional software development,
while we were mainly interested in using our corpus to validate analysis tools for
17 http://bpmnspector.org/ConstraintList_EXT.html.
18 https://github.com/matthiasgeiger/BPMNspector-fixSeqFlow.
19 http://oss.models-db.com/.
6 Conclusion
In this paper, we presented our approach for mining software repositories hosted
on GitHub.com for BPMN process models. We identified 21,306 potential BPMN
artifacts, originating from 1,251 repositories, which after further filtering and cleansing
constituted a corpus of 8,904 distinct serialized BPMN 2.0 process models.
The corpus can be used to answer various empirical research questions on the use
of business process models. We here demonstrated how to complement an existing
case study for the linting tool BPMNspector with an evaluation on a much
larger scale. Doing so, we can confirm results on the frequency of violations of
the BPMN standard, thus showing the need for analysis tools like BPMNspector.
Threats to Validity. There are a number of threats that affect the validity of
our approach. For a general discussion on the threats of mining software reposi-
tories, we refer the reader to [19]. Process modeling in software repositories may
not resemble industrial practice and our results may thus not generalize beyond
open software development and academia [19,23]. This is a common threat to
external validity, which we also find for other studies, e.g. [20]. Analyzing the
transferability of empirical results about software repositories and academia to
an industrial context is an open research question. We therefore advocate the
complementary use of our approach with other empirical research methods. Fur-
thermore, we just mined GitHub.com and did not consider other software forges.
Since GitHub.com counts the largest number of hosted repositories, we though
believe that our results apply for most open software development. Due to the
heuristics used for the identification of BPMN models, we also have missed pro-
cess models and thus may underestimate certain effects [12], e.g., model dupli-
cation or the frequency of graphical process models. We therefore only provide
a descriptive analysis of the use of BPMN on GitHub.com and do not use our
corpus for inferential statistics and prediction. Finally, due to GitHub.com being
a dynamic environment, repositories may change or be removed over time.
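The identification heuristics themselves are those of [12]; purely to illustrate why such heuristics can miss models, a filter of the following kind (our sketch, not the study's actual implementation) would overlook BPMN content stored under unusual file names or serializations:

def looks_like_bpmn(filename: str, content: bytes) -> bool:
    # Stage 1: cheap filename test for common BPMN extensions.
    if filename.lower().endswith((".bpmn", ".bpmn2", ".bpmn20.xml")):
        return True
    # Stage 2: probe the payload for the BPMN 2.0 model namespace; renamed files
    # are caught here, but non-standard serializations still slip through.
    return b"http://www.omg.org/spec/BPMN/20100524/MODEL" in content[:4096]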
References
1. Agrawal, K., Aschauer, M., Thonhofer, T., Bala, S., Rogge-Solti, A., Tomsich, N.:
Resource classification from version control system logs. In: EDOC Workshops
2016, pp. 1–10. IEEE (2016)
2. Bala, S., Cabanillas, C., Mendling, J., Rogge-Solti, A., Polleres, A.: Mining project-
oriented business processes. In: Motahari-Nezhad, H.R., Recker, J., Weidlich, M.
(eds.) BPM 2015. LNCS, vol. 9253, pp. 425–440. Springer, Cham (2015). https://
doi.org/10.1007/978-3-319-23063-4 28
3. Bala, S., Mendling, J.: Monitoring the software development process with process
mining. In: Shishkov, B. (ed.) BMSD 2018. LNBIP, vol. 319, pp. 432–442. Springer,
Cham (2018). https://doi.org/10.1007/978-3-319-94214-8 34
4. Business Process Model and Notation (BPMN), Version 2.0. Object Management
Group (OMG) Standard (2011). https://www.omg.org/spec/BPMN/2.0/PDF
5. Chaudron, M.R.V., Fernandes-Saez, A., Hebig, R., Ho-Quang, T., Jolak, R.: Diver-
sity in UML modeling explained: observations, classifications and theorizations. In:
Tjoa, A.M., Bellatreche, L., Biffl, S., van Leeuwen, J., Wiedermann, J. (eds.) SOF-
SEM 2018. LNCS, vol. 10706, pp. 47–66. Springer, Cham (2018). https://doi.org/
10.1007/978-3-319-73117-9 4
6. Corradini, F., Fornari, F., Polini, A., Re, B., Tiezzi, F.: RePROSitory: a repository
platform for sharing business PROcess modelS. In: BPM PhD/Demos 2019, pp.
149–153. CEUR (2019)
7. Dijkman, R.M., Dumas, M., Ouyang, C.: Semantics and analysis of business process
models in BPMN. Inf. Softw. Techn. 50(12), 1281–1294 (2008)
8. Dumas, M., Rosa, M.L., Mendling, J., Reijers, H.A.: Fundamentals of Business
Process Management, 2nd edn. Springer, Heidelberg (2018)
9. Fahland, D., Favre, C., Jobstmann, B., Koehler, J., Lohmann, N., Völzer, H.,
Wolf, K.: Instantaneous soundness checking of industrial business process mod-
els. In: Dayal, U., Eder, J., Koehler, J., Reijers, H.A. (eds.) BPM 2009. LNCS,
vol. 5701, pp. 278–293. Springer, Heidelberg (2009). https://doi.org/10.1007/978-
3-642-03848-8 19
10. Geiger, M., Neugebauer, P., Vorndran, A.: Automatic standard compliance assess-
ment of BPMN 2.0 process models. In: ZEUS 2017, pp. 4–10. CEUR (2017)
11. Gousios, G.: The GHTorrent dataset and tool suite. In: MSR 2013, pp. 233–236.
IEEE (2013)
12. Hebig, R., Quang, T.H., Chaudron, M., Robles, G., Fernandez, M.A.: The quest
for open source projects that use UML: mining GitHub. In: MODELS 2016, pp.
173–183. ACM (2016)
13. Heinze, T.S., Amme, W., Moser, S.: Process restructuring in the presence of
message-dependent variables. In: Maximilien, E.M., Rossi, G., Yuan, S.-T., Lud-
wig, H., Fantinato, M. (eds.) ICSOC 2010. LNCS, vol. 6568, pp. 121–132. Springer,
Heidelberg (2011). https://doi.org/10.1007/978-3-642-19394-1 13
14. Heinze, T.S., Amme, W., Moser, S.: Static analysis and process model transforma-
tion for an advanced business process to Petri net mapping. Softw.: Pract. Exp.
48(1), 161–195 (2018)
15. Heinze, T.S., Stefanko, V., Amme, W.: Mining von BPMN-Prozessartefakten auf
GitHub. In: KPS 2019, pp. 111–120 (2019). https://www.hb.dhbw-stuttgart.de/
kps2019/kps2019 Tagungsband.pdf
16. Heinze, T.S., Türker, J.: Certified information flow analysis of service implementa-
tions. In: SOCA 2018, pp. 177–184. IEEE (2018)
17. Ho-Quang, T., Chaudron, M.R.V., Robles, G., Herwanto, G.B.: Towards an infras-
tructure for empirical research into software architecture: challenges and directions.
In: ECASE@ICSE 2019, pp. 34–41. IEEE (2019)
18. Ho-Quang, T., Hebig, R., Robles, G., Chaudron, M.R.V., Fernandez, M.A.: Prac-
tices and perceptions of UML use in open source projects. In: ICSE-SEIP 2017, pp.
203–212. IEEE (2017)
19. Kalliamvakou, E., Gousios, G., Blincoe, K., Singer, L., German, D.M., Damian,
D.E.: The promises and perils of mining GitHub. In: MSR 2014, pp. 92–101. ACM
(2014)
20. Kunze, M., Luebbe, A., Weidlich, M., Weske, M.: Towards understanding process
modeling – the case of the BPM academic initiative. In: Dijkman, R., Hofstetter,
J., Koehler, J. (eds.) BPMN 2011. LNBIP, vol. 95, pp. 44–58. Springer, Heidelberg
(2011). https://doi.org/10.1007/978-3-642-25160-3 4
21. Lenhard, J., Ferme, V., Harrer, S., Geiger, M., Pautasso, C.: Lessons learned from
evaluating workflow management systems. In: Braubach, L., et al. (eds.) ICSOC
2017. LNCS, vol. 10797, pp. 215–227. Springer, Cham (2018). https://doi.org/10.
1007/978-3-319-91764-1 17
22. Leopold, H., Mendling, J., Günther, O.: Learning from quality issues of BPMN
models from industry. IEEE Softw. 33(4), 26–33 (2016)
23. Lübke, D., Pautasso, C.: Empirical research in executable process models. Empir-
ical Studies on the Development of Executable Business Processes, pp. 3–12.
Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17666-2 1
24. Mendling, J.: Empirical studies in process model verification. In: Jensen, K., van der
Aalst, W.M.P. (eds.) Transactions on Petri Nets and Other Models of Concurrency
II. LNCS, vol. 5460, pp. 208–224. Springer, Heidelberg (2009). https://doi.org/10.
1007/978-3-642-00899-3 12
25. Mendling, J., Sánchez-González, L., Garcı́a, F., Rosa, M.L.: Thresholds for error
probability measures of business process models. J. Syst. Softw. 85(5), 1188–1197
(2012)
26. Pinggera, J., et al.: Styles in business process modeling: an exploration and a model.
Softw. Syst. Model. 14(3), 1055–1080 (2013). https://doi.org/10.1007/s10270-013-
0349-1
27. Pinggera, J., et al.: Tracing the process of process modeling with modeling phase
diagrams. In: Daniel, F., Barkaoui, K., Dustdar, S. (eds.) BPM 2011. LNBIP, vol.
99, pp. 370–382. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-
28108-2 36
28. Robles, G., Ho-Quang, T., Hebig, R., Chaudron, M., Fernandez, M.A.: An extensive
dataset of UML models in GitHub. In: MSR 2017, pp. 519–522. IEEE (2017)
29. Schneid, K., Usener, C.A., Thöne, S., Kuchen, H., Tophinke, C.: Static analysis of
BPMN-based process-driven applications. In: SAC 2019, pp. 66–74. ACM (2019)
30. Skouradaki, M., Roller, D., Leymann, F., Ferme, V., Pautasso, C.: On the road
to benchmarking BPMN 2.0 workflow engines. In: ICPE 2015, pp. 301–304. ACM
(2015)
31. Vanhatalo, J., Völzer, H., Leymann, F.: Faster and more focused control-flow analy-
sis for business process models through SESE decomposition. In: Krämer, B.J., Lin,
K.-J., Narasimhan, P. (eds.) ICSOC 2007. LNCS, vol. 4749, pp. 43–55. Springer,
Heidelberg (2007). https://doi.org/10.1007/978-3-540-74974-5 4
An Empirical Investigation of the Intuitiveness
of Process Landscape Designs
Abstract. Process landscapes define the scope and relationships between an orga-
nization’s business processes and are therefore essential for their management.
However, in contrast to business process diagrams, where nowadays BPMN pre-
vails, process landscape diagrams lack standardization, which results in numerous
process landscape designs. Accordingly, our goal was to investigate how intuitive
current landscape designs are to users with low expertise, as well as to users having
expertise in BPMN and landscape modeling. A total of 302 subjects participated
in the research, showing that previous expertise impacts the interpretation of land-
scape elements and designs, whereas, when contextual information was available,
subjects responded more consistently. The results also show that the basic rela-
tionships between processes are intuitive to users, even when only proximity
between shapes is used. Our findings may inform future designs of
languages for process landscapes. They may also be useful for those who actually
model process landscape diagrams and search for suitable notations.
1 Introduction
process landscape in order to measure the overall organization’s potential [3]. A pro-
cess landscape represents processes as ‘black-boxes’ and so focuses on interrelationships
between processes and external participants. In this manner, a process landscape enables
an organization to maintain an overview of processes, which simplifies process-related
communication and may represent a starting point for process discovery.
Accordingly, a process landscape model has to be comprehensible to all major
stakeholders of an organization [2, 3]. This implies the usage of a common, compact,
and intuitive language for the creation of process landscape diagrams. However, no
standardized languages for creating process landscapes exist [4], whereas BPMN 2.0
does not cover the wide landscapes and complexities that exist in the process-modeling
domain [5, 6]. Consequently, organizations, as well as process modeling tool vendors
(e.g., ARIS Express, Visual Paradigm, Vizi Modeler and Signavio), define their own
‘overviews of processes’ most commonly by imitating ‘value chain’ diagrams.
As a result, landscape diagrams differ from each other, and since there is no common
landscape modeling language, an inexperienced user could infer a different meaning
from the appearance of a language element, which could negatively impact the compre-
hension of a diagram and the corresponding decisions made. Since the graphical
representation significantly impacts the cognitive effectiveness of a diagram [7–9], it
is important to specify a common palette of comprehensible symbols fitting the
process landscapes domain.
In light of these challenges, the main goal of our work was to investigate the intu-
itiveness of the representations of process landscape designs as found in academia and
industry, i.e., to test whether representations of landscape concepts are intuitive (i.e.,
semantically transparent, clear) to people with 'near-to-zero' knowledge of process
landscape design. To this end, we defined the following research questions, which
could be tested empirically:
2 Research Background
2.1 Process Landscapes
A high-level model of an organization that represents the overall structure of business
processes and their relationships emerged as a tool to aid process-oriented companies
in managing large business process collections [10]. With roots in the early 1980s,
when Porter [11] introduced the value chain model, the concept is commonly specified
as a ‘process landscape’ and represents a set of interconnected processes within an
organizational system. Alternative terms in use are ‘process overview’ [12], and ‘process
map’. However, according to the findings of Poels et al. [13], the term ‘process map’
may either represent a model of a business process architecture or an entry-level model
of a business process model architecture.
A process landscape model (Fig. 1) shows the structure, grouping, modularity, func-
tionality, and technology of chain processes, business processes, and working processes.
In contrast to business process models, processes on the landscape level are modeled as
‘black-boxes’ whose internal complexity is hidden for the sake of simplicity and clarity.
Process landscape diagrams may be used in numerous ways, addressing the con-
cerns of business-oriented users as well as technically-oriented ones [12]. While being
specified on the macro level, they provide a comprehensive understanding and high-
light different types of relationships or dependencies with other processes and artifacts
[14]. Process landscape diagrams help process owners, quality managers, and other
process-related stakeholders to ease the maintenance of their processes by offering a
quick overview of processes. Afterward, in detailed process diagrams, individual busi-
ness processes may be decomposed into finer levels of detail (i.e., sub-processes and
tasks). In summary, just as modeling individual processes is a starting point for any process
improvement effort, modeling the architecture of an organization's collection of business
processes is required for any analysis, design, or improvement effort that transcends the
level of individual processes [13].
Figure 1 represents two common process landscape diagrams, with processes
depicted as chevron arrows (left) or rectangles (right), whereas arrows represent between-
process relationships. The left diagram in Fig. 1 additionally connects organizational
processes with the environment by specifying connections to external participants (i.e.,
stakeholders).
diagrams with hidden details; (2) use of BPMN Collaboration diagrams and (3) use of
Enterprise-wide BPMN Process diagrams.
An analysis performed in [18] demonstrates that none of the BPMN approaches
results in diagrams with a graphical similarity to common landscape diagrams (e.g.,
value chains). Analytically, this was confirmed by Malinova et al. [19], who performed
a semantic mapping between BPMN and 'Process maps'. Their results show that
BPMN in its current form is not appropriate for process landscape design.
2.3 Semiotics
Fig. 2. Main concepts as specified in semiotics (left) and OMG’s namespace (right)
Based on the relation between the signifier and signified, semiotics defines three
types of signs: (1) icon, where a signifier physically resembles the signified (i.e. person
sign on Fig. 2); (2) symbol, where the signifier presents the signified with an arbitrary
or conventional relation; and (3) index, where the signifier is related to the signified
by an associative relation (i.e. Fig. 2, right – the darker symbol supports the lighter
one (from below), which is analogous to common real-life situations). In the process
languages’ space (e.g. BPMN, CMMN, DMN), a sign is commonly referred to as an
element, whereas the signifier is commonly referred to as a depiction of an element [21].
The definition of a process element corresponds to the 'signified' in semiotics,
meaning the specification of a language concept. Since the focus of our investigation is on
process languages, we will use the terms according to the process languages namespace,
i.e., a 'process element consists of its definition and depiction'.
Caire et al. [22] stated that “The key to designing visual notations that are understandable
to naïve users is a property called semantic transparency”, which means that the mean-
ing (semantics) of a sign is clear (i.e. intuitive, transparent) from its appearance alone
3 Empirical Research
Since we investigated intuitiveness, the ideal candidate for the research would be an
individual who (1) understands the meaning of the concepts used in landscape
modeling, yet has (2) no experience with the corresponding landscape modeling
notations. Accordingly, IT and business students of the same degree were selected
as suitable candidates for the research.
The focal research instrument was an online questionnaire, which was structured into
the following parts. In the first part, subjects were asked to provide basic demographic
information (age, gender) and their experience in BPMN as well as in landscape mod-
eling (both measured on a 7-point Likert scale from the lowest to the highest degree of
experience, plus the self-reported number of modeled diagrams). In the second part, subjects
were introduced to alternative depictions of common landscape elements (i.e., landscape
elements as used in academia and industry, including BPMN and ArchiMate) and
were asked to associate the most appropriate meaning with them (including the 'undecided'
answer). In addition, partial diagrams were presented to subjects to test whether they would
infer the meaning more effectively when using a diagram's contextual information. To min-
imize learning effects, the individual items as well as the answers were randomized. In the
third part of the questionnaire, a 'two-treatment'-like design was applied to test the
alternative notations used in landscape modeling. Due to the paper's length limitations,
this part was excluded from this paper. The instrument was prepared in Slovenian and
English versions and was completely anonymized. The actual research was performed
in January 2019. In total, 588 subjects were invited to participate, 347 subjects actually
opened the questionnaire or partially completed it, whereas 302 subjects successfully
completed it. Of them, 65% of the subjects came from Slovenia, whereas 35% came
from Ukraine.
4 Results
The results were collected and partially analyzed in 1KA (https://www.1ka.si/d/en), an
advanced open-source application that enables services for online surveys. Afterward,
the data was exported into MS Excel as well as SPSS, to perform additional analysis.
As evident from Table 1, the sum across all levels of expertise does not match the total
number of subjects, since subjects who answered 'undecided' about their level of
expertise were not classified into any expertise level.
The symbol for the collapsed processes collection (D6) was correctly identified by
BPMN experts, which may have roots in the analogous representation of a collapsed
BPMN subprocess. Expanded collections of processes (D9 and D10) were not iden-
tified correctly. However, this may have roots in the research instrument, since subjects
focused on the relationships between the processes present in the collection instead of the
collection itself.
Individual landscape elements, as well as the relationships between them, were addi-
tionally investigated by considering contextual information, i.e., by placing elements
into partial diagrams. Initially, subjects were asked to identify the type of a process by
providing them with a simple value-chain-based landscape diagram (Fig. 3).
In this case, subjects recognized the core process correctly in 73% of cases, whereas the
supporting process was recognized correctly by 68% of subjects. This is a significant
increase when compared to the individually investigated symbols (17%, Table 2, D3). In
the case of the management process, the success rate was 59%, whereas in the individual
investigation it was 14% (Table 2, D5).
The focus of the left diagram in Fig. 4 was to investigate the relationships between
processes as specified by Eid-Sabbagh et al. [31], namely composition, specializa-
tion, trigger, and information flow, with the last two being specified as behavioral ones
(Table 3, the highest values are bolded). Since two symbols may share the same meaning
in praxis, which constitutes 'symbol overload' (e.g., solid line and arrow, different types
of arrows), we did not specify any correct definitions in this case.
Table 3 reveals that subjects responded most consistently (with respect to different
levels of expertise) in the case of the conditional trigger relationship (D11), which may be
associated with the intuitiveness of the diamond-shaped symbol, commonly representing a
decision point. Information flow (D12) was correctly recognized by experts, which may
be related to the fact that the depiction is equal to BPMN's message flow. In a similar
manner, the 'generic-specific' relationship (D13) was correctly identified by the experts,
who may have knowledge of either UML class diagrams or ArchiMate. The sequential
relationship (D14) was not correctly recognized by inexperienced users, whereas all other
expertise levels, as well as the overall sample, inferred the correct meaning. The answers were
also inconsistent in the case of a solid line, where the majority of subjects reported it as
being a 'generic-specific' relationship (i.e., as common in organizational charts).
The focus of the right diagram in Fig. 4 was to investigate the implicit relationships
between processes, which commonly occur in a landscape diagram, especially a value-
chain-based one. Accordingly, subjects were asked to specify the relationships between
the processes sharing the same letter and color (Table 4).
(Table 4; only the column headers are recoverable: Element set, Valid, Parallel, Sequentially, Independent, Undecided.)
5 Conclusions
Based on the results presented and discussed in the previous section, we may answer
the stated research questions as follows.
RQ2: How does previous knowledge impact the comprehension of process landscape
designs?
Two types of previous knowledge were investigated in our research: experience with
BPMN and experience with landscape modeling (Table 1). The results of our investi-
gation show that, when considering previous knowledge, subjects responded differently in
several cases when compared to the 'inexperienced' subgroup of subjects. This was most
evident in the cases of the rectangle (Table 2, D4) and the chevron arrow with a plus sign
(Table 2, D6). Table 6 summarizes these findings by comparing the comprehension of process
landscape elements against their definitions as specified or used in praxis.
Besides, the connection elements, whose depictions are mainly specified in a conven-
tional way (they do not have any 'built-in mnemonics'), showed different compre-
hension levels when considering different levels of expertise. For example, BPMN
experts successfully associated a dotted arrow with an information flow (in BPMN
it represents a message flow), and they successfully associated a solid arrow with a
triggering relationship (in BPMN it represents a sequence flow).
"undecided"
(ArchiMate)
(ArchiMate)
ParƟcipant
Document
CollecƟon
CollecƟon
collecƟon
(chevron)
(chevron)
Database
ExperƟse
Average
Support
process
process
Process
Process
Process
Mngm.
I 29% 18% 12% 27% 12% 5% 59% 85% 1% 4% 27%
L 39% 25% 29% 20% 25% 22% 88% 92% 2% 2% 21%
B 31% 25% 19% 19% 25% 13% 69% 88% 6% 0% 21%
Expertise: I=inexperienced; B= BPMN expert; L= landscapes modeling expert;
5.1 Implications
The results of this research should be considered with the following internal and external
limitations in mind. With respect to external validity, there is a certain risk in
generalizing the results beyond the research sample. While the students reported not being
skilled in BPMN and landscape modeling languages, another group of subjects could
yield different results (e.g., subjects from another environment could be influenced
by other signs in their everyday life). Besides, the sample of subjects experienced in
landscape design was rather small (16 subjects). Secondly, there is also a certain risk
associated with the instrument, where subjects may not have been able to correctly
interpret the depictions, as well as the semantics of the symbols, from the instructions
(e.g., as in the case of expanded process collections).
Our future work will focus on specifying a modified landscape modeling nota-
tion based on these results, and on testing whether the resulting diagrams are more
cognitively effective than existing ones. Besides, we may extend the research to other
regions to test how cultural differences impact the intuitiveness of symbols.
Acknowledgment. The authors (Gregor Polančič) acknowledge the financial support from the
Slovenian Research Agency (research core funding No. P2-0057).
References
1. Dijkman, R., Vanderfeesten, I., Reijers, H.A.: Business process architectures: overview,
comparison and framework. Enterp. Inf. Syst. 10, 129–158 (2016). https://doi.org/10.1080/
17517575.2014.928951
2. Dumas, M., Rosa, M.L., Mendling, J., Reijers, H.: Fundamentals of Business Process
Management. Springer, Heidelberg (2018). https://doi.org/10.1007/978-3-662-56509-4
3. Becker, J., Pfeiffer, D., Räckers, M., Fuchs, P.: Business process management in public
administrations - the PICTURE approach. In: PACIS 2007 Proceedings, p. 142 (2007)
4. Malinova, M., Leopold, H., Mendling, J.: An explorative study for process map design. In:
Nurcan, S., Pimenidis, E. (eds.) CAiSE Forum 2014. LNBIP, vol. 204, pp. 36–51. Springer,
Cham (2015). https://doi.org/10.1007/978-3-319-19270-3_3
5. Van Nuffel, D., De Backer, M.: Multi-abstraction layered business process modeling. Comput.
Ind. 63, 131–147 (2012). https://doi.org/10.1016/j.compind.2011.12.001
6. von Rosing, M., von Scheel, H., Scheer, A.-W.: The Complete Business Process Handbook:
Body of Knowledge from Process Modeling to BPM, Volume I: Body of Knowledge from
Process Modeling to BPM, vol. 1. Morgan Kaufmann, Waltham (2014)
7. Larkin, J.H., Simon, H.A.: Why a diagram is (sometimes) worth ten thousand words. Cogn.
Sci. 11, 65–100 (1987)
8. Siau, K.: Informational and computational equivalence in comparing information modeling
methods. JDM 15, 73–86 (2004). https://doi.org/10.4018/jdm.2004010103
9. Zhang, J., Norman, D.: Representations in distributed cognitive tasks. Cogn. Sci. 18, 87–122
(1994)
10. Gonzalez-Lopez, F., Bustos, G.: Business process architecture design methodologies – a
literature review. Bus. Process Manag. J. (2019). https://doi.org/10.1108/BPMJ-09-2017-
0258
11. Porter, M.E.: Competitive Advantage: Creating and Sustaining Superior Performance. Free
Press; Collier Macmillan, New York, London (1985)
12. Gonzalez-Lopez, F., Pufahl, L.: A landscape for case models. In: Reinhartz-Berger, I.,
Zdravkovic, J., Gulden, J., Schmidt, R. (eds.) BPMDS/EMMSAD -2019. LNBIP, vol. 352,
pp. 87–102. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20618-5_6
13. Poels, G., García, F., Ruiz, F., Piattini, M.: Architecting business process maps. Comput. Sci.
Inf. Systems. (2019). https://doi.org/10.2298/csis181118018p
14. Stefanov, V., List, B., Schiefer, J.: Bridging the gap between data warehouses and business
processes: a business intelligence perspective for event-driven process chains. In: Ninth IEEE
International EDOC Enterprise Computing Conference, EDOC 2005, pp. 3–14. IEEE (2005)
15. Weske, M.: Business Process Management: Concepts, Languages, Architectures. Springer,
Heidelberg (2019)
16. Dijkman, R., Vanderfeesten, I., Reijers, H.A.: The Road to a Business Process Architec-
ture: An Overview of Approaches and Their Use. Eindhoven University of Technology, The
Netherlands (2011)
17. Muehlen, M.Z., Ho, D.T.: Service process innovation: a case study of BPMN in practice. In:
Hawaii International Conference on System Sciences, Proceedings of the 41st Annual. p. 372
(2008). https://doi.org/10.1109/HICSS.2008.388
18. Polančič, G., Huber, J., Tabares, M.S.: An analysis of BPMN-based approaches for process
landscape design [electronic resource] (2017)
19. Malinova, M., Mendling, J.: Why is BPMN not appropriate for Process Maps? In: ICIS 2015
Proceedings. (2015)
20. Chandler, D.: Semiotics: The Basics. Routledge, London; New York (2007)
21. OMG: Business Process Model and Notation version 2.0, http://www.omg.org/spec/BPMN/
2.0/. Accessed 15 Mar 2011
22. Caire, P., Genon, N., Heymans, P., Moody, D.L.: Visual notation design 2.0: towards user com-
prehensible requirements engineering notations. In: 2013 21st IEEE International Require-
ments Engineering Conference (RE), pp. 115–124 (2013). https://doi.org/10.1109/RE.2013.
6636711
23. Petre, M.: Why looking isn’t always seeing: readership skills and graphical programming.
Commun. ACM 38, 33–44 (1995). https://doi.org/10.1145/203241.203251
24. Britton, C., Jones, S.: The untrained eye: how languages for software specification support
understanding in untrained users. Hum.-Comput. Interact. 14, 191–244 (1999). https://doi.
org/10.1080/07370024.1999.9667269
25. Britton, C., Jones, S., Kutar, M., Loomes, M., Robinson, B.: Evaluating the intelligibility
of diagrammatic languages used in the specification of software. In: Anderson, M., Cheng,
P., Haarslev, V. (eds.) Diagrams 2000. LNCS (LNAI), vol. 1889, pp. 376–391. Springer,
Heidelberg (2000). https://doi.org/10.1007/3-540-44590-0_32
26. Hruby, P.: Structuring specification of business systems with UML (with an emphasis on
workflow management systems). In: Patel, D., Sutherland, J., Miller, J. (eds.) Business Object
Design and Implementation II, pp. 77–89. Springer, London (1998). https://doi.org/10.1007/
978-1-4471-1286-0_9
27. Neiger, D., Churilov, L., Flitman, A.: Business process modelling with EPCs. In: Neiger,
D., Churilov, L., Flitman, A. (eds.) Value-Focused Business Process Engineering: A Systems
Approach. ISIS, vol. 14, pp. 1–31. Springer, Boston (2009). https://doi.org/10.1007/978-0-
387-09521-9_5
28. Polančič, G., Šumak, B., Pušnik, M.: A case-based analysis of process modeling for public
administration system design. Inf. Model. Knowl. Bases XXXI 321, 92 (2020)
29. Recker, J.: Continued use of process modeling grammars: the impact of individual difference
factors. Eur. J. Inf. Syst. 19, 76–92 (2010)
30. Christensen, L.B., Johnson, B., Turner, L.A.: Research Methods, Design, and Analysis
(2011)
31. Eid-Sabbagh, R.-H., Dijkman, R., Weske, M.: Business process architecture: use and cor-
rectness. In: Barros, A., Gal, A., Kindler, E. (eds.) BPM 2012. LNCS, vol. 7481, pp. 65–81.
Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32885-5_5
Requirements and Method Engineering
(EMMSAD 2020)
A Multi-concern Method for Identifying
Business Services: A Situational Method
Engineering Study
O. Ege Adali(B) , Oktay Türetken, Baris Ozkan, Rick Gilsing, and Paul Grefen
1 Introduction
The most recent business approaches are increasingly shifting their focus away from
goods-thinking to services-thinking [1]. Driven by the influence of digitalization,
increased connectivity, and the global economy, the concept of service has become central to
value creation [2]. As a result, many organizations are providing services as first-class
standalone offerings in their value propositions, whereas many others are enhancing
their offerings by transforming their products into services through servitization [3].
In this context, one major challenge for such organizations is the identification of
their service offerings. When identifying service offerings, organizations have to deal
with various service provisioning issues, such as determining what can be offered to
which existing and potential customers and business partners [4–8], alignment of service
offerings with the long-term strategic interests of the organization [9], and identification
of business capabilities to provide a specific service offering [10–12]. To address this
broad range of concerns, scholars have proposed the concept of ‘business service’, and
relevant business service identification methods (BSIMs).
Business services are engineering artifacts designed by service providers with the pur-
pose of achieving their strategic goals [4–6]. In that sense, the design of a business ser-
vice involves bringing specificity to resources which have the potential to be acquired
by specific customers or customer segments [16]. All in all, business services repre-
sent different types of value propositions in the form of offerings that service providers
expose to advertise and manage their resources and interactive processes [11]. As is the
case with the parent term service, the literature provides many different definitions of
the term business service. Each definition caters to a set of design concerns intrinsic to
the context in which the term is used. As a result, this creates an ambiguity surrounding
the term and this ambiguity represents a challenge to the investigation of how business
services are identified and defined. Therefore, as a first step in the purpose of designing
a BSIM, we have studied the definitions of the concept of business service in the schol-
arly literature. These definitions bring together various design concerns related to the
concept.
To discover these associated design concerns (which we will refer to as business
service concerns from this point on), we conducted a literature review by performing
searches on established scientific databases (Ebsco, ScienceDirect, Scopus, Springer-
Link, Web of Science, and Wiley) using the search string "business service". This resulted
in an initial set of 104 studies. Next, we applied backward snowballing on these studies
[17]. The main inclusion criterion for selecting studies from the start set was that a
study should propose a definition for the term 'business service'. We conducted two
iterations, and our final list included a total of 16 studies. The references of
these studies can be found in Table 1 below. We extracted each definition introduced by
these 16 studies and applied a grounded theory approach [18] to extract
and refine a complete set of business service concerns. Accordingly, we first coded the
business service concerns that each definition adheres to and then translated these into
overarching themes by applying axial coding [19]. The resulting themes, with the sources
of the definitions, are presented in Table 1.
The aim of business service identification is to identify and define candidate business
services on the basis of the business service concept [9]. Since the concept of business
service involves multiple concerns, service identification involves the consideration of
these concerns as well. This requires a systematic method that supports the examination
of organizations from multiple perspectives [13]. To address this need, scholars
and practitioners have proposed several BSIMs. These methods use different techniques that
involve procedures focusing on a key business artifact (e.g., business processes, goals,
business functions, features) and a set of activities that employ this artifact in identifying
business services [13]. However, reviews conducted on these BSIMs conclude that,
while BSIMs recognize the business service concerns to a certain degree, there is no
consensus on how to deal with multiple business service concerns in a systematic way
[13]. Comparing the previous work on BSIMs, we identified 47 unique BSIMs (see
footnote 1), which populate our method base.
3 Research Design
Aligned with the most contemporary perspective on marketing, S-D logic [2], we define
the context of our project as "identification of business services for a service provider
making a value proposition to a business network to co-create value with other actors
in the same network". Furthermore, the service provider's major concern is to leverage
the business service concept to the fullest in terms of addressing the 11 business service
concerns. In light of this context information, an analysis of our method base revealed
that the present methods provide method chunks to perform certain aspects of business
service identification aligned with our context, such as the consideration of the network or of
value propositions during identification. Additionally, we observed the existence of at least
one method that adheres to a specific business service concern. Therefore, we decided
to follow a method-driven strategy and set our method engineering goal as "assembling
a new method by re-using the method chunks of existing BSIMs".
1 Please visit the following link for the method base: https://sites.google.com/view/bsimbase/.
# Method intention
MEI1 The method should identify business service(s) that comply with the 11 business service concerns
MEI2 The method should target a focal business unit (FBU) to specify business service(s)
MEI3 The method should identify business services based on a value proposition of the FBU
MEI4 The method should be applied in a business network consisting of actors defined in the value proposition
– S1: Application domain = ‘Information Systems’ & Design activity = ‘Model the
Value Co-Creation Context’ & Situation = ‘Business Services’ & Intention = ‘Identify
goal dependencies between actors’
– S2: Application domain = ‘Information Systems’ & Design activity = ‘Capture the
Core Business’ & Situation = ‘Business Services’ & Intention = ‘Map business
capabilities of the focal business unit’
– S3: Application domain = ‘Information Systems’ & Design activity = ‘Relate the
Core Business to the Value Creation Context’ & Situation = ‘Business Services’ &
Intention = ‘Map business capabilities of the focal business unit to goals of the value
creation actors’
– S4: Application domain = ‘Information Systems’ & Design activity = ‘Design Ser-
vice Specifications’ & Situation = ‘Business Services’ & Intention = ‘Create service
specifications’
Each BSIM residing in our method base is labeled with specific tags indicating its
domain, design activity, situation, and intention. We therefore utilized these tags for
querying the method chunks meeting the requirements of each strategy. Our queries
resulted in the selection of a total of 13 method chunks (MCs). The resulting set of MCs for
each query is given in Table 3.
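As an illustration, such a tag-based query amounts to a simple filter over the method base; the dictionary layout below is our assumption, with tag values taken from strategy S1:

# Sketch: the method base as a list of tagged method chunks, queried per strategy.
method_base = [
    {"id": "MC1",
     "application_domain": "Information Systems",
     "design_activity": "Model the Value Co-Creation Context",
     "situation": "Business Services",
     "intention": "Identify goal dependencies between actors"},
    # ... tags for the remaining chunks of the 47 BSIMs
]

def query(base, **tags):
    # Return every method chunk whose tags match all given criteria exactly.
    return [mc for mc in base if all(mc.get(k) == v for k, v in tags.items())]

s1_chunks = query(method_base,
                  application_domain="Information Systems",
                  design_activity="Model the Value Co-Creation Context",
                  situation="Business Services",
                  intention="Identify goal dependencies between actors")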
4 MCBSIM
The method, which consists of four main steps as presented in Fig. 2, has been designed
to be used in a context where a focal business unit (FBU) makes a value proposition to
co-create value with the actors (including customers and other stakeholders) in a certain
context. Accordingly, the initial input for the method is a value proposition that the FBU
aims to make in a certain value co-creation context.
Fig. 2. MCBSIM
In Step 1, the objective is to model the value co-creation context in terms of deter-
mining goal and means dependencies between the value co-creation actors. The method
chunk chosen for carrying out this step is MC1: i* Strategic Dependency Model [47],
which supports four main modelling concepts:
• Actor: A business role (e.g., organization and customer) that carries out actions to
achieve goals by exercising its knowhow. We refer to an actor as a Value Co-creation
Actor in this body of work.
• Goal: A desirable business state an actor aims to reach or sustain.
• Means: A concrete course of action (task) taken to accomplish goals. The realization
of a means is under the control of the actor who proposes (owns) the means.
• Dependency: A link between two actors indicating that one actor (depender) depends
on the other (dependee) for something in order that the former may attain some goal.
Two types of dependencies are considered:
– Goal Dependency: The depender depends on the dependee to bring about a certain
state in the world. The dependee is given the freedom to choose how to do it.
– Means or Task Dependency: The depender depends on the dependee to carry out an
activity. A task dependency specifies how the task is to be performed, but not why.
The output of this step is a set of goal models: a generic model of the whole context, and
goal models focusing on one-to-one goals and means dependencies between the FBU
and each party in the context. It should be noted that the goal models are relative to the
value co-creation context.
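To make the four modelling concepts concrete, the following minimal sketch encodes them as data structures (our own encoding; MC1 prescribes the concepts, not this representation):

from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Actor:
    name: str  # value co-creation actor, e.g., an organization or a customer

@dataclass(frozen=True)
class Goal:
    description: str  # desirable business state the owner aims to reach or sustain
    owner: Actor

@dataclass(frozen=True)
class Means:
    description: str  # concrete course of action, controlled by its owner
    owner: Actor

@dataclass(frozen=True)
class Dependency:
    depender: Actor
    dependee: Actor
    target: Union[Goal, Means]  # Goal -> goal dependency; Means -> task dependency

# Example: the traveler depends on the provider for the goal of being given bikes.
traveler = Actor("Traveler")
provider = Actor("Bike Sharing Service Provider")
dep = Dependency(traveler, provider, Goal("Travelers shall be given bikes", provider))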
In Step 2, the main objective is to focus on the FBU and determine the business
capabilities of the FBU that contribute to making the selected value proposition.
Various definitions of capabilities exist in the literature [48]; however, we adopted the
definition provided in [49], as it is a synthesis of the definitions provided in the literature.
Accordingly, a capability (1) is possessed by a resource or a group of resources
(tangible and intangible), (2) is the potential for action via a process, and (3) produces
a value for a customer (internal/external).
To capture such capabilities, we used the template given in Fig. 5, which was adapted
from [49]. The method chunk for carrying out this step, MC8: Capability Modeling [31],
captures capabilities by defining service domains and identifying the capabilities that exist
in a specific service domain [31]. A service domain is described as a sphere of control
that contains a collection of service operations to achieve related goals [31]. The service
operations are the activities that are carried out within the service domain to interact
with other service domains [50].
In Step 3, the main objective is to determine the capabilities that enable the achieve-
ment of the identified goals and means. As described in MC8: Capability Modeling [31],
the focus of this determination activity is to match the delivered business outcome of
each capability with one or more goals, and the processes (or activities in the processes)
with the means. After the matching, the capabilities that are necessary to make the value
proposition are identified. These capabilities are the main output of this stage.
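A minimal sketch of this matching step is given below; the enables predicate stands in for the analyst's judgement, and all names are ours:

def match_capabilities(capabilities, goals, enables):
    # Link each capability to the goals its delivered business outcome enables;
    # capabilities linked to at least one goal are candidates for the value proposition.
    links = {cap: [g for g in goals if enables(cap, g)] for cap in capabilities}
    return {cap: gs for cap, gs in links.items() if gs}

# Usage with toy data and a trivial predicate:
candidates = match_capabilities(
    ["Bike Lending", "Accident Handling"],
    ["Travelers shall be given bikes"],
    enables=lambda cap, goal: cap == "Bike Lending",
)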
In Step 4, the achieved outputs are combined and processed for the specification of
the business services, as described in MC8: Business Service Specification [31]. For
the specification, we used the template given in Fig. 7, which was designed in accordance
with the 11 business service concerns. As depicted in the template, a business service specifi-
cation involves a set of business service attributes, such as the capability, owner, service
operations, used resources, etc.
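Fig. 7 is not reproduced here; purely as an indication of the template's shape, the attributes named in the text could be captured as follows (a sketch with illustrative values; the field names are our assumptions):

from dataclasses import dataclass, field
from typing import List

@dataclass
class BusinessServiceSpecification:
    # Attributes mentioned in the text; the Fig. 7 template contains further fields.
    name: str
    capability: str                # the FBU capability exposed as a business service
    owner: str                     # the focal business unit owning the service
    service_operations: List[str] = field(default_factory=list)
    used_resources: List[str] = field(default_factory=list)

bike_lending = BusinessServiceSpecification(
    name="Bike Lending",
    capability="Bike Lending",
    owner="Bike Sharing Service Provider",
    service_operations=["lend bike", "end session"],
    used_resources=["bicycles", "smart locks", "service platform"],
)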
5 Demonstration
In this section, we demonstrate the utility of our method by applying it in an illustrative
scenario that is based on a real-life business case. The scenario depicts the case for
a new urban bike sharing business model. We omit the names of the organizations in
the scenario to keep their anonymity. In the selected business model, there are four
actors: traveler (e.g., tourists, students, employees), bike sharing service provider, bike
maintenance provider, and the local municipality. The co-created value proposed for the
traveler is a flexible and comfortable travelling experience via cycling around the city.
Thus, the value to be co-created with the business model encapsulates high availability
and widespread coverage of bicycles within the city. This should allow the traveler, whenever
s/he desires, to take a bicycle and travel around the city. The traveler is not concerned
with managing or maintaining the bicycle and can store the bicycle at any available slot
at a parking station. As such, flexibility and comfort should be granted to the travelers.
How each actor contributes to the value co-creation is described below:
The traveler, being the customer, contributes to the value co-creation by providing
data on the usage of the service. Therefore, the value proposition of the traveler is pro-
file data and data about service use. The Bike Sharing Service Provider contributes to the
value co-creation by providing the facilities for bike sharing. As such, it is responsi-
ble for establishing the infrastructure for the bicycles, the software system to operate
and use the bicycles, as well as the IT system to interact with users. The Bike Sharing
Service Provider is the focal business unit (FBU) of the business case. The Bike Maintenance
Provider contributes to the value co-creation by ensuring that bikes are in good condi-
tion and available for travelers wherever and whenever they desire. The Local Municipality
contributes to the value co-creation by providing legal, financial, or operational
support.
In step 1, we regarded the value co-creation contribution of each actor (except the
customer) as that actor's motivations and interests [47] and iteratively dissected these
contributions into goals and means. The goal(s) and the single means of the customer,
on the other hand, were defined by dissecting the main characteristic of the co-created
value, which is flexible travel. For space considerations, we only provide the goal model
(Fig. 3) that depicts the goal and means dependencies between the FBU (service provider)
and the customer (traveler).
Fig. 3. i* goal model for value co-creation context (between service provider and traveler)
In step 2, we determined the service domains and service operations of the FBU
based on the means of the FBU defined in the goal models. Then, we matched these service
operations to an already existing list of business capabilities of the FBU, which is
shown in Fig. 4. Furthermore, we re-defined each capability in detail according to our
template. Figure 5 presents an example that depicts the specification of the capability
"bike lending".
In step 3, we explicitly linked the capabilities that enable the FBU to achieve the goals
defined in step 1. Then, we examined the linked capabilities against the 11 business service
concerns to identify capabilities that can be business service candidates. This resulted in
the selection of two capabilities: Bike Lending and Traveler Guidance (as depicted in
Fig. 6). The capability Service Platform Management is an enabling capability for the
other capabilities and is not provided to the value co-creation context (i.e., it does not
comply with concern C4); therefore, it is not a candidate business service. Furthermore,
the capabilities Bike Maintenance and Accident Handling are partner capabilities that
do not belong to the FBU.
(Figs. 4-6; only fragments of the figure labels are recoverable, naming capabilities and operations such as Traveler Guidance, Bike Availability, Platform Management, the enabling capability Service Platform Management, and the partner capability Accident Handling.)
In step 4, we specified two business services for the two FBU capabilities Bike Lending
and Traveler Guidance by bringing together all the entities and properties of a business
service as defined by our 11 business service concerns. The specification for the Bike
Lending business service is presented in Fig. 7.
This study is subject to potential limitations mainly due to the strategy used to
demonstrate the utility of the proposed method. A demonstration with an illustrative
scenario is usually tailored to an ideal context and is thus highly prone to hinder the
discovery of issues that might result from the use of the artifact at hand in a real setting
or context [51]. Since the proposed method, and particularly its steps, are highly rooted
in the academic literature (mainly in the form of business service concerns and existing
BSIMs), its effects on a real-world situation are yet to be discovered. Therefore, as
future work, the method can be applied in a number of real-life scenarios in the form
of case studies, and its utility and validity can be further evaluated using qualitative
research methods that involve practitioners as users of the method. Accordingly, the
method can be improved and fine-tuned to address any potential shortcomings discovered
in these evaluations.
References
1. Plugge, A., Janssen, M.: Exploring determinants influencing a service-oriented enterprise
strategy: an executive management view BT - digital services and platforms. Considerations
for Sourcing, pp. 35–55 (2019)
2. Vargo, S.L., Lusch, R.F.: Evolving to a new dominant logic for marketing. J. Mark. 68(1),
1–17 (2004)
3. Wolfson, A., Dominguez-Ramos, A., Irabien, A.: From goods to services: the life cycle
assessment perspective. J. Serv. Sci. Res. 11(1), 17–45 (2019)
4. Cherbakov, L., Galambos, G., Harishankar, R., Kalyana, S., Rackham, G.: Impact of service
orientation at the business level. IBM Syst. J. 44(4), 653–668 (2005)
5. Sanz, J., Nayak, N., Becker, V.: Business services as a modeling approach for smart business
networks (2006)
6. Tohidi, H.: Modelling of business services in service oriented enterprises. Procedia Comput.
Sci. 3, 1147–1156 (2011)
7. Flaxer, D., Nigam, A.: Realizing business components, business operations and business
services. In: IEEE International Conference on E-Commerce Technology for Dynamic E-
Business, pp. 328–332 (2004)
8. Brocke, H., Uebernickel, F., Brenner, W.: A methodical procedure for designing consumer
oriented on-demand IT service propositions. Inf. Syst. E-bus. Manag. 9(2), 283–302 (2011)
9. Arsanjani, A., Ghosh, S., Allam, A., Abdollah, T., Ganapathy, S., Holley, K.: SOMA: a method
for developing service-oriented solutions. IBM Syst. J. 47(3), 377–396 (2008)
10. Lusch, R.F., Nambisan, S.: Service innovation in the digital age service innovation: a service-
dominant logic perspective. MIS Q. 39(1), 155–176 (2015)
11. Turetken, O., Grefen, P., Gilsing, R., Adali, O.E.: Service-dominant business model design
for digital innovation in smart mobility. Bus. Inf. Syst. Eng. 61(1), 9–29 (2019)
12. Suratno, B., Ozkan, B., Turetken, O., Grefen, P.: A method for operationalizing service-
dominant business models into conceptual process models. In: Shishkov, B. (ed.) BMSD
2018. LNBIP, vol. 319, pp. 133–148. Springer, Heidelberg (2018). https://doi.org/10.1007/
978-3-319-94214-8_9
13. Huergo, R.S., Pires, P.F., Delicato, F.C., Costa, B., Cavalcante, E., Batista, T.: A systematic
survey of service identification methods. Serv. Oriented Comput. Appl. 8(3), 199–219 (2014)
14. Ralyte, J., Deneckere, R., Rolland, C.: Towards a generic model for situational method engi-
neering. In: Eder, J., Missikoff, M. (eds.) CAiSE 2003. LNCS, vol. 2681, pp. 95–110. Springer,
Heidelberg (2003). https://doi.org/10.1007/3-540-45017-3_9
15. Iacovelli, A., Souveyet, C., Rolland, C.: Method as a Service (MaaS). In: 2008 Second
International Conference on Research Challenges in Information Science, pp. 371–380 (2008)
16. Arnould, E.J.: Service-dominant logic and resource theory. J. Acad. Mark. Sci. 36(1), 21–24
(2008)
17. Wohlin, C.: Guidelines for snowballing in systematic literature studies and a replication in
software engineering. In: Proceedings of the 18th International Conference on Evaluation and
Assessment in Software Engineering, pp. 38:1–38:10 (2014)
18. Corbin, J., Strauss, A.: Basics of Qualitative Research: Techniques and Procedures for
Developing Grounded Theory, 3rd edn. Thousand Oaks, California (2008)
19. Charmaz, K.: The search for meanings - grounded theory. In: Rethinking Methods in
Psychology, pp. 27–49. Sage Publications, London (1996)
20. Estrada, H.: A service-oriented approach for the i* framework. Universidad Politecnica de
Valencia (2008)
21. Flaxer, D., Nigam, A., Vergo, J.: Using component business modeling to facilitate business
enterprise architecture and business services at the US Department of Defense. In: IEEE
International Conference on e-Business Engineering (ICEBE 2005), pp. 755–760 (2005)
22. Nayak, N., Nigam, A., Sanz, J., Marston, D., Flaxer, D.: Concepts for service-oriented business
thinking. In: Proceedings - 2006 IEEE International Conference on Services Computing, SCC
2006, pp. 357–364 (2006)
23. Tian, C., Ding, W., Cao, R., Lee, J.: Business componentization: a guidance to application
service design. In: Min Tjoa, A., Xu, L., Chaudhry, S.S. (eds.) Research and Practical Issues
of Enterprise Information Systems, pp. 97–107. Springer US, Boston (2006). https://doi.org/
10.1007/0-387-34456-x_10
24. Sanz, J., et al.: Business services and business componentization: new gaps between business
and IT. In: IEEE International Conference on Service-Oriented Computing and Applications
(SOCA 2007), pp. 271–278 (2007)
25. Estrada, H., Martínez, A., Santillán, L.C., Pérez, J.: A new service-based approach for
enterprise modeling (2013)
26. Karakostas, B., Zorgios, Y., Alevizos, C.C.: The semantics of business service orchestration.
In: Eder, J., Dustdar, S. (eds.) BPM 2006. LNCS, vol. 4103, pp. 435–446. Springer, Heidelberg
(2006). https://doi.org/10.1007/11837862_41
27. Cartlidge, A., Hanna, A., Rudd, C., Macfarlane, I., Windebank, J., Rance, S.: An introductory
overview of ITIL V3 (2007)
28. Böttcher, M., Klingner, S.: Providing a method for composing modular B2B services. J. Bus.
Ind. Mark. 26(5), 320–331 (2011)
29. Kohlborn, T., Fielt, E., Korthaus, A., Rosemann, M.: Towards a service portfolio management
framework. In: Proceedings of 20th Australasian Conference on Information Systems, pp. 1–
12 (2009)
30. Ralyté, J., Rolland, C.: An assembly process model for method engineering. In: Dittrich,
K.R., Geppert, A., Norrie, M.C. (eds.) CAiSE 2001. LNCS, vol. 2068, pp. 267–283. Springer,
Heidelberg (2001). https://doi.org/10.1007/3-540-45341-5_18
31. Kohlborn, T., Korthaus, A., Chan, T., Rosemann, M.: Identification and analysis of business
and software services-a consolidated approach. IEEE Trans. Serv. Comput. 2(1), 50–64 (2009)
32. Andersson, B., Johannesson, P., Zdravkovic, J.: Aligning goals and services through goal and
business modelling. Inf. Syst. E-bus. Manag. 7(2), 143–169 (2009)
33. Ramel, S., Grandry, E., Dubois, E.: Towards a design method supporting the alignment
between business and services software. In: Proceedings - International Computer Software
and Applications Conference, vol. 1, pp. 349–354 (2009)
34. Lo, A., Yu, E.: From business models to service-oriented design: a reference catalog approach.
In: Parent, C., Schewe, K.D., Storey, V.C., Thalheim, B. (eds.) ER 2007, vol. 4801, pp. 87–101.
Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-75563-0_8
35. Grandry, E., Dubois, E., Picard, M., Rifaut, A.: Managing the alignment between business and
software services requirements from a capability model perspective. In: Mähönen, P., Pohl,
K., Priol, T. (eds.) ServiceWave 2008. LNCS, vol. 5377, pp. 171–182. Springer, Heidelberg
(2008). https://doi.org/10.1007/978-3-540-89897-9_15
36. Bianchini, D., Cappiello, C., De Antonellis, V., Pernici, B.: P2S: a methodology to enable inter-
organizational process design through web services. In: van Eck, P., Gordijn, J., Wieringa,
R. (eds.) CAiSE 2009. LNCS, vol. 5565. LNCS, pp. 334–348. Springer, Heidelberg (2009).
https://doi.org/10.1007/978-3-642-02144-2_28
37. Jamshidi, P., Sharifi, M., Mansour, S.: To establish enterprise service model from enter-
prise business model. In: Proceedings of 2008 IEEE International Conference on Services
Computing, SCC 2008, vol. 1, pp. 93–100 (2008)
38. Kaabi, R.S., Souveyet, C., Rolland, C.: Eliciting service composition in a goal driven manner.
In: Proceedings of the 2nd International Conference on Service Oriented Computing - ICSOC
2004, p. 308 (2004)
39. Kim, Y., Doh, K.: The service modeling process based on use case refactoring. In: Abramow-
icz, W. (eds.) BIS 2007. LNCS, vol. 4439, pp. 108–120. Springer, Heidelberg (2007). https://
doi.org/10.1007/978-3-540-72035-5_9
40. Klose, K., Knackstedt, R., Beverungen, D.: Identification of services - a stakeholder-based
approach to SOA development and its application in the area of production planning. In:
ECIS, no. 2007, pp. 1802–1814 (2007)
41. Kohlmann, F., Alt, R.: Business-driven service modeling - a methodological approach from
the finance industry. In: Sabre 2007, pp. 1–14 (2007)
42. Lee, J., Muthig, D., Naab, M.: An approach for developing service oriented product lines. In:
2008 12th International Software Product Line Conference, pp. 275–284 (2008)
43. Lee, J., Sugumaran, V., Park, S., Sansi, D.: An approach for service identification using value
co-creation and IT convergence. In: Proceedings of 1st ACIS/JNU International Conference
on Computers, Networks, Systems, and Industrial Engineering, CNSI 2011, pp. 441–446
(2011)
44. Suntae, K., Minseong, K., Sooyong, P.: Service identification using goal and scenario in
service oriented architecture. Neonatal. Paediatr. Child Heal. Nurs. 419–426 (2008)
45. Si, H., Ni, Y., Yu, L., Chen, Z.: A service-oriented analysis and modeling using use case app-
roach. In: Proceedings of 2009 International Conference Computational Intelligent Software
Engineering, CiSE 2009, no. 60773163 (2009)
46. Wang, Z., Xu, X., Zhan, D.: Normal forms and normalized design method for business service.
In: IEEE International Conference on e-Business Engineering (ICEBE 2005), pp. 79–86
(2005)
47. Yu, E.: Modelling strategic relationships for process reengineering (1995)
48. Offerman, T., Stettina, C.J., Plaat, A.: Business capabilities: a systematic literature review
and a research agenda. In: 2017 International Conference on Engineering, Technology and
Innovation (ICE/ITMC) (2017)
49. Michell, V.: A Focused Approach to Business Capability, no. Bmsd, pp. 105–113 (2013)
50. Grefen, P., Turetken, O., Traganos, K., den Hollander, A., Eshuis, R.: Creating agility in
traffic management by collaborative service-dominant business engineering. In: Camarinha-
Matos, L., Bénaben, F., Picard, W. (eds.) PRO-VE 2015, IFIP Advances in Information and
Communication Technology, vol. 463, pp. 100–109. Springer, Heidelberg (2015). https://doi.
org/10.1007/978-3-319-24141-8_9
51. Peffers, K., Rothenberger, M., Tuunanen, T., Vaezi, R.: Design science research evaluation.
In: Design Science Research in Information Systems. Advances in Theory and Practice,
pp. 398–410 (2012)
Modeling Complex Business
Environments for Context Aware Systems
1 Introduction
Advances in Information Technology (IT) have changed the nature of business
environments. Today, business entities exist and work in an environment where
they are interdependent and co-create value [1]. Characteristics of such a
business environment include intricate value exchanges, a dynamic market,
innovative products/services, and loose customer loyalties.
By leveraging the advances in IT and computing capabilities, business enti-
ties adapt and customize their products/services to better align with customers’
needs. In other words, they strive to provide context-aware products/services.
Providing context aware services is no trivial task, particularly in today's business
environments. Consequently, “How to offer contextual capabilities in complex business
environments?” is one of the four major challenges for Information Systems
researchers highlighted by Kadiri et al. in [2]. In the same article, as a
suggestion for future research, Kadiri et al. noted that “Context models for complex
enterprise applications.... are not being addressed in the research community
so far”. Similar observations have been made by Hong et al. in [3].
Contextual data is necessary to provide context aware products/services.
However, the extraction of (relevant) contextual data in complex business
environments requires a proper understanding of the environment [4]. In this paper, we
present a step-wise method for modeling complex business environments, supporting
the design of context aware systems.
2 Background
Context aware systems gather and analyze (relevant) contextual data, thereby
aiding a business entity to provide context aware services [4]. Using such systems,
the business entity can adapt, modify, update or even change its products/services
and underlying processes. Context aware systems are being developed and
used in various application domains, including health-care [5], disaster
management [6] and smart cities [7].
Fig. 1. Different business environments along with context interpretation and response
computation in each of them. Text in blue indicates the focus of this paper. (Color figure
online)
3 Method
This section presents our method for modeling complex business environments.
Using our method, situations of interest for a context aware system can be
interpreted and modeled. The method has three sequential phases, i.e., Motivation
Modeling, Use-case Modeling and Situation Modeling. With each phase, the level
of detail about the system's context increases, thereby enabling its interpretation.
We have designed the method based on our experiences in different research
projects aimed at designing context aware systems.
Phase I, motivation modeling, provides a high level view of the current and
desired scenario. Modeling them clarifies the motivation behind providing con-
text aware services. The output of Phase I is a motivation model, linking the
current scenario, to-be-provided services and the desired scenario.
Besides obvious value exchanges like money, products and services, business
entities in a business environment also exchange intangible assets, e.g., knowledge,
data, expertise and information. To correctly model all value exchanges
in a complex business environment, a formal yet simple modeling technique is
needed. Previous research, e.g. [1], shows that e-3 value modeling is a useful
tool for that purpose. It enables the identification, modeling and analysis
of all value exchanges between business entities in a complex business
environment [1]. An e-3 value model consists of actors, value offerings and value
transfers (Table 1).
In Phase II, use-case modeling, results from Phase I are used to create a domain
model and use case diagrams. A domain model includes both physical and
abstract objects, thereby facilitating a clearer understanding of the context. It is
derived from the motivation model and thus restricts domain modeling to rele-
vant entities only. The use case diagrams build upon the domain model and the
motivation model to further dissect (new) services. They provide the designer a
concrete view of the interactions between the user, the system and the context.
In Phase III, situation modeling, results from Phase II are used to create situa-
tion models. Here, a situation refers to a state of the environment in which the
user uses the context aware service.
• Activity 6: Using the use case diagrams, write down the situations in which the
user uses the services provided by the system.
Rationale: Documenting situations of interest.
• Activity 7: If Activity 6 produces new entities, add them to the domain model.
Rationale: A more comprehensive representation of the context. All entities
in the domain model can together represent all situations in which the user
uses the services.
• Activity 8: Based on the domain model, formalize the situations from Activity
6. Represent them using pseudo code or graphical situation models [13].
Rationale: To give a formal representation of the situations of interest in the
business environment. Pseudo code or situation models are then converted to
executable code.
At the end of Phase III, the designer has modeled situations of interest in
the business environment. They are represented in terms of entities and their
properties. It is these situations which a context aware system should detect and
interpret. By employing the above steps, the system designer has a methodical
approach for modeling a complex business environment. The following section
illustrates the method using the case of Janssen Transport b.v., a transport
company desiring dynamic routing of its trucks.
2 Further on, we will use the term Hotels to refer to all collection points for laundry.
2. Late arrival of trucks at the hotels, i.e., beyond the time window. Delays
are primarily due to traffic conditions. Due to delays, the same hotels, those
visited last in each route, get affected. In the desired scenario, JT wants to
adopt a pro-active approach of routing its trucks based on historical travel
time data and real time traffic data.
3. While some trucks, at the end of their route, return to FC Cleaning Unit half
empty (unused free capacity), others offload laundry and have to resume their
routes. In the desired scenario, JT wants to predict the daily demands
of each hotel and optimize the truck routes. Demand prediction will be based
on historic demand data.
To address the above concerns, JT requires a context aware routing system. The
system should dynamically route trucks, optimize truck capacities and minimize
delayed arrival of trucks.
Fig. 4. Motivation model using ArchiMate 3.0 motivation elements (assessment, drivers
and goals)
the relevance and effect of each problem/concern. (c) Goal elements model
the new/improved service provided in the desired scenario. Each goal caters
to at least one driver or a lower level goal.
[Fig. 5: domain model of the routing case, relating the Context Aware Routing System, Truck Driver, Truck, Hotel, Location, Actual Time of Arrival, Delay (Truck) and Total Laundry (Load).]
[Fig. 6: use case diagram in which the Truck Driver performs Log In, Enter Assigned Sector, Provide Location, Request Next Stop, Mark Hotel as Visited, Enter Actual Time of Arrival and Enter Picked Up Laundry; these use cases «include» Provide Dynamic Routing and Load Optimized Routing.]
• Situation 3. A truck cannot arrive at hotel(s) within the time window. It
should be possible for another truck to pick up laundry from such hotel(s).
In Fig. 8 (top), Truck 2 is en-route between Hotel E and Hotel F. The road
has traffic congestion, thus traffic movement is slow. Though Truck 2 can
reach Hotel F on time, it will not be able to do so for Hotel G. In the
current scenario, dirty laundry from Hotel G would be left uncollected.
In the desired scenario, another truck picks up dirty laundry from Hotel
G, as shown in Fig. 8 (bottom).
• Activity 7: After describing Situations 1–3 in textual form, new entities were
discovered and added to the domain model (Fig. 9).
• Activity 8: The three situations from Activity 6 are formalized (pseudo code)
using entities from the updated domain model.
• Situation 1. At time t, the elements of the set RemainingHotels_Truck(t) are
all hotels still to be visited by a truck. It is equal to all hotels in
the sector assigned to the truck minus VisitedHotels. Thus, when a
truck, say Truck B, leaves JT in the morning, RemainingHotels_TruckB(t)
contains all hotels in Truck B's assigned sector.
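As a hedged illustration of Activity 8's final step (pseudo code converted to executable code), the formalized situation could be rendered roughly as follows in Python; the function and variable names are ours, not those of JT's actual system:

def remaining_hotels(sector_hotels, visited_hotels):
    # RemainingHotels_Truck(t): hotels in the truck's assigned sector not yet visited
    return set(sector_hotels) - set(visited_hotels)

# Example: Truck B leaves JT in the morning with no hotels visited yet,
# so its remaining hotels equal its whole sector.
assert remaining_hotels({"Hotel A", "Hotel B"}, set()) == {"Hotel A", "Hotel B"}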
Fig. 7. Situation 2, when a truck is full en-route a different truck visits the remaining
hotels
Fig. 8. Situation 3, when a truck cannot arrive at a hotel on time, a different truck
visits the remaining hotels
To calculate the ETA, traffic events and the delay associated with each event are needed.
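A minimal sketch of ours of such an ETA computation, under the simplifying assumption (ours, not the paper's) that each traffic event on the remaining route contributes an independent, known delay, with all times in minutes:

def estimate_eta(departure_time, base_travel_time, event_delays):
    # ETA = departure time + undisturbed travel time + total delay from traffic events
    return departure_time + base_travel_time + sum(event_delays)

# e.g. leaving at 8:00 (480 min) with 35 min base travel time and two events:
eta = estimate_eta(480, 35, [10, 5])  # 530 min, i.e. 8:50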
5 Discussion
JT's business environment is a more complex and more complicated one (Category IV,
Fig. 1). The identification of situations is based on historic data (e.g., travel
times, pick up load) and real time data (e.g., traffic, truck capacity). However,
such an identification cannot be done with certainty, as travel times and pick up
load are stochastic, not deterministic. The context aware routing system would
constantly monitor the context to detect the situations. Furthermore, the
situations are difficult to identify via manual monitoring of the context, thereby
making it a complex environment. The following points make the desired scenario
more complicated, i.e., they make the computation of the correct system response
difficult:
1. Computation of the fastest route for each truck, using historic travel time
data and real time traffic information.
2. Calculation of the monetary loss owing to unpicked dirty laundry from a
hotel.
3. The ETA of a truck at a hotel cannot be determined with certainty.
4. Selection of a truck among all trucks, such that the extra costs associated with
visiting additional hotels (e.g., fuel, driver compensation, etc.) are less than the
loss incurred by unpicked laundry.
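Point 4 amounts to a constrained selection; a rough sketch under our own simplifying assumptions (costs and losses given as plain numbers, names ours) could look as follows:

def select_backup_truck(trucks, extra_cost, loss_unpicked):
    # A truck qualifies only if the extra cost of visiting the additional
    # hotels (fuel, driver compensation, etc.) is below the loss incurred
    # by unpicked laundry; among qualifying trucks, pick the cheapest.
    candidates = [t for t in trucks if extra_cost(t) < loss_unpicked]
    return min(candidates, key=extra_cost, default=None)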
In Sect. 1, Introduction, we used the article by Kadiri et al. [2] to motivate
our research. In the same article, the authors presented a generic model
showcasing four layers of a context processing life cycle, i.e., Acquisition,
Modeling, Processing and Dissemination. Table 2 compares these layers vis-à-vis the
phases in our method. Phase III maps to the Processing layer only partially, since
only Activity 8 delivers code and situation models suitable for processing by
development tools. Our method does not prescribe how situations of interest should
be communicated to the actors/business entities in the context. Consequently, there
is no apparent mapping between the Dissemination layer and a phase in our
method. Further study is needed to assess the above comparison; it is an
area for future research. Additionally, future researchers should investigate the
domain specific modifications needed for the method.
[Fig. 9: updated domain model, extending the initial one with ETA, Traffic Events, Delay (Events), Route, Sector, Predicted Load, Time Window, Free Capacity, Picked up Laundry, Visited Hotels and Total Capacity around the Context Aware Routing System, Truck Driver, Truck, Hotel, Location, Actual Time of Arrival, Delay (Truck) and Total Laundry (Load).]
Our method will enable system designers to avoid two pitfalls during context
aware system design: (a) considering a significant amount of contextual information
during the initial design steps which later proves to be less relevant, and (b)
not including relevant contextual data in system design. Using our method, the
designer is not overwhelmed by the complexity of business environments, thus
avoiding the above pitfalls.
The proposed method is based on our experiences in system design and
development. We did not follow a specific design methodology for designing
the method. This is a limitation of our research, as the design choices we made may
not be explicit.
6 Conclusion
Complex business environments pose unique challenges for the design of context
aware systems. In this paper, we presented a step-wise method to model
a complex business environment. Via our method, we (a) provide a direction
for investigating the context and (b) highlight the use of the investigation results
in subsequent design steps. System designers will find our method useful for
the design of context aware systems in complex business environments. Future
researchers should apply the method in varied application domains, which would
lead to further improvement of the method.
References
1. Wieringa, R., Engelsman, W., Gordijn, J., Ionita, D.: A business ecosystem archi-
tecture modeling framework. Paper presented at the 21st IEEE Conference on
Business Informatics (CBI) 2019, Moscow, Russia (2019)
2. Kadiri, S.E., et al.: Current trends on ICT technologies for enterprise information
systems. Comput. Ind. 79, 14–33 (2016)
3. Hong, J., Suh, E., Kim, S.-J.: Context-aware systems: a literature review and
classification. Expert Syst. Appl. 36(4), 8509–8522 (2009)
4. van Engelenburg, S., Janssen, M., Klievink, B.: Designing context aware systems:
a method for understanding and analysing context in practice. J. Logical Algebraic
Methods Program. 103, 79–104 (2019)
5. Trinugroho, Y.P.D., Reichert, F., Fensli, R.: An ontology enhanced SOA-based
home integration platform for the well being of inhabitants. Paper presented at
the 4th IADIS International Conference on e-Health 2012, Lisbon, Portugal (2012)
6. Fleischer, J., et al.: An integration platform for heterogeneous sensor systems in
GITEWS - Tsunami Service Bus. Nat. Hazards Earth Syst. Sci. 10, 1239–1252
(2010)
7. Auger, A., Exposito, E., Lochin, E.: iQAS: an integration platform for QoI assess-
ment as a service for smart cities. Paper presented at 3rd World Forum on IoT
2016, Reston, USA (2016)
8. Chen, H., Zeng, S., Lin, H., Ma, H.: Munificence, dynamism, and complexity: how
industry context drives corporate sustainability. Bus. Strategy Environ. 25, 125–
141 (2017)
9. Saleh, A., Watson, R.: Business excellence in a volatile, uncertain, complex and
ambiguous environment (BEVUCA). TQM J. 29(5), 705–724 (2017)
10. Vasconcelos, F.C., Ramirez, R.: Complexity in business environments. J. Bus. Res.
64, 236–241 (2011)
11. ArchiMate 3.0.1 Specification. https://pubs.opengroup.org/architecture/
archimate3-doc/chap06.html
12. OMG Business Motivation Model v1.3. https://www.omg.org/spec/BMM/1.3/
PDF
13. Costa, P.D., Mielke, I.T., Pereira, I., Almeida, J.P.A.: A model-driven approach to
situations: situation modeling and rule-based situation detection. Paper presented
at the 16th IEEE International EDOC Conference 2012, Beijing, China (2012)
Towards Automating the Synthesis
of Chatbots for Conversational
Model Query
1 Introduction
Instant messaging platforms have been widely adopted as one of the main tech-
nologies to communicate and exchange information. Most of them provide built-
in support for integrating chatbot applications, which are automated conver-
sational agents capable of interacting with users of the platform [10]. Chat-
bots have proven useful in various contexts to automate tasks and improve the
user experience, such as automated customer service [23], education [9] and
e-commerce [21]. However, although many platforms have recently emerged for
creating chatbots (e.g., DialogFlow [6], IBM Watson [7], Amazon Lex [1]), their
construction and deployment remain a highly technical task.
2.1 Motivation
As a motivating example, assume a city hall would like to provide open access
to its real-time traffic information system. Given the growth of the open data
movement, this is a common scenario in many cities, like Barcelona1 or Madrid2.
We assume that the data provided includes a static part made of the different
districts and their streets, with information on the speed limits. In addition, a
dynamic part updated in real-time decorates the streets and their segments
with traffic intensity values and incidents (road works, street closings, accidents
or bottlenecks). Figure 1 shows a meta-model capturing the structure of the
provided information.
In this scenario, citizens would benefit from user-friendly ways to query those
traffic models. However, instead of relying on the construction of dedicated front-
ends with fixed queries, or on the use of complex model query languages like
1 https://opendata-ajuntament.barcelona.cat/.
2 https://datos.madrid.es.
[Fig. 1 fragment: City with districts (name: String, multiplicity *), and an optional (0..1) intensity association to TrafficIntensity (value: float, /serviceLevel: {moving, heavy, jam, closed}).]
OCL, our proposal is the use of conversational queries based on NL via chat-
bots. Chatbots can be used from widely used social networks, like Telegram or
Twitter, facilitating their use by citizens. Hence, citizens would be able to issue
simple queries like “give me all accidents with more than one injury”; and also
conversational queries like “what are the incidents in Castellana Street now?”,
and upon the chatbot reply, focus on a subset of the results with “select those
that are accidents”. Finally, for the case of dynamic models, reactive queries like
“ping me when Castellana Street closes” would be possible.
Our proposal consists of the generation of a dedicated query chatbot given the
domain meta-model. But, before introducing our approach, the next subsection
explains the main concepts involved in chatbot design.
Intents are defined via training phrases. These phrases may include parame-
ters of a certain type (e.g., numbers, days of the week, countries). The parameter
types are called entities. Most platforms come with predefined sets of entities
and permit defining new ones. Some platforms permit structuring the conversa-
tion as an expected flow of intents. For this purpose, a common mechanism is
providing intents with a context that stores information gathered from phrase
parameters, and whose values are required to trigger the intent. In addition,
there is normally the possibility to have a fallback intent, to be used when the
bot does not understand the user input.
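To make these notions concrete, the following platform-neutral sketch (ours, not tied to any specific chatbot platform or vendor API) represents an intent with training phrases, typed parameters, and the contexts it requires and provides:

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Intent:
    name: str
    training_phrases: List[str]                                 # phrases with {PARAMETER} slots
    parameters: Dict[str, str] = field(default_factory=dict)    # parameter name -> entity type
    required_context: List[str] = field(default_factory=list)   # must be set to trigger the intent
    provided_context: List[str] = field(default_factory=list)   # stored after a successful match

all_instances = Intent(
    name="allInstances",
    training_phrases=["give me all the {CLASSNAME}", "show me the {CLASSNAME}"],
    parameters={"CLASSNAME": "Class"},
    required_context=["MODEL"],
    provided_context=["CLASSNAME"],
)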
3 Approach
Figure 3 shows the scheme of our approach.
(a) Intents:
– loadModel: loads the working model from the backend. Training phrases: “load the model {MODEL}”, “open model {MODEL}”… Provided context: MODEL (type text). Required context: –.
– allInstances: returns all instances of a given class. Training phrases: “give me all the {CLASSNAME}”, “show me the {CLASSNAME}”… Provided context: CLASSNAME (type Class). Required context: MODEL.
– filteredAllInstances: returns all instances of a given class satisfying a condition. Training phrases: “select the {CLASSNAME} with {FILTER1}”, “display the {CLASSNAME} with {FILTER1} {CONJ} {FILTER2}”… Provided context: CLASSNAME (type Class), FILTER1 and FILTER2 (type Condition), CONJ (type Conjunction). Required context: MODEL.
Fig. 4. Intents and entities generated for the running example chatbot.
connectives. We provide an entity Condition for the filters, explained below. This
intent would be selected upon receiving phrases like “give me all accidents with
more than one injury” (please note the singular variation w.r.t. the attribute
name injuries).
In addition to intents, we create several entities based on the domain meta-model
and the NL configuration. Specifically, we create an entity named Class
(Table (b)) with an entry for each meta-model class name. These entries may
have synonyms, as provided by the NL configuration, to refer to the classes
in a more flexible way. Likewise, we create an entity for each attribute name
according to their type: String (Table (c)), Numeric (Table (d)), Boolean and Date
(omitted for space constraints). For example, the StringAttribute entity (Table (c))
has an entry for all String attributes called name. Just like classes, these entries
may have synonyms if provided in the NL configuration.
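A hedged sketch of this generation step, assuming (our assumption) that the meta-model is available as a mapping from class names to (attribute name, attribute type) pairs and that the NL configuration supplies optional synonym lists:

def generate_entities(metamodel, synonyms=None):
    # metamodel: {"Accident": [("injuries", "Numeric"), ("name", "String")], ...}
    synonyms = synonyms or {}
    entities = {"Class": {}}
    for cls, attributes in metamodel.items():
        entities["Class"][cls] = synonyms.get(cls, [])
        for attr, attr_type in attributes:
            entity_name = attr_type + "Attribute"   # StringAttribute, NumericAttribute, ...
            entities.setdefault(entity_name, {})[attr] = synonyms.get(attr, [])
    return entities

entities = generate_entities({"Accident": [("injuries", "Numeric")]},
                             synonyms={"Accident": ["crash"]})
# {'Class': {'Accident': ['crash']}, 'NumericAttribute': {'injuries': []}}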
The Condition entity (Table(f)) is a composite one, i.e., its entries are made
of one or more entities. This entity permits defining filter conditions in queries,
such as “name starts with Ma” or “injuries greater than one”.
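Likewise, an entry of the composite Condition entity can be thought of as an attribute, an operator and a value; a sketch of ours (the operator vocabulary is our assumption) of turning such an entry into an executable filter:

OPERATORS = {
    "greater than": lambda v, x: v > x,
    "less than": lambda v, x: v < x,
    "starts with": lambda v, x: str(v).startswith(x),
}

def condition_predicate(attribute, operator, value):
    # e.g. condition_predicate("injuries", "greater than", 1)
    compare = OPERATORS[operator]
    return lambda obj: compare(getattr(obj, attribute), value)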
Regarding the complexity of the chatbot, the number of intents is fixed:
it depends only on the primitives of the underlying query language that the
chatbot exposes. Figure 4 shows two primitives of OCL: allInstances and
allInstances()→select(cond). Other query types can be added similarly, which would
require defining further intents. The number of generated entities is also fixed,
while the number of entries in each entity depends on the meta-model size and
the synonyms defined in the NL configuration.
The generated execution model contains a set of execution rules that bind user intentions to response actions as
part of the chatbot behaviour definition (cf. label 4 in Fig. 2). For each intent in
the Intent model, we generate the corresponding execution rule in the execution
model using an event-based language that receives as input the recognized intent
together with the set of parameter values matched by the NL engine during the
analysis and classification of the user utterance.
All the execution rules follow the same process: the matched intent and the
parameters are used to build an OCL-like query to collect the set of objects
the user wants to retrieve. The intent determines the type of query to perform
(e.g., allInstances, select, etc.), while the parameters identify the query parame-
ters, predicates, and their composition. The query computation is delegated to
the underlying modelling platform (see next section), and the returned model
elements are processed to build a human-readable message that is finally posted
to the user by the bot engine.
As an example, Listing 1 shows the execution rule that handles an allInstances
operation. The class to obtain the instances of is retrieved from the context
variable (available in every execution rule) and passed to our EMF Platform, which
performs the query. Next, the instances variable holding the results is processed
to produce a readable string (in this case a list of names), and the Chat Platform
is called to reply to the user.
on intent GetAllInstances do
    // retrieve the class name stored in the "collection" context by the NL engine
    val Map<String, Object> collectionContext = context.get("collection")
    // delegate the allInstances query to the EMF platform
    val instances = EMFPlatform.GetAllInstances(collectionContext.get("class") as String)
    // build a readable list of instance names and reply via the chat platform
    val resultString = instances.map[name].join(", ")
    ChatPlatform.Reply("I found the following results: " + resultString)
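Only the allInstances rule is listed above; the following is a speculative Python rendering (ours, not the Xatkit DSL) of the companion rule for filtered queries, where emf_platform and chat_platform stand in for the corresponding platform components and the predicate is assumed to have been built from the matched FILTER parameters, e.g., with a helper like condition_predicate sketched earlier:

def on_filtered_all_instances(context, emf_platform, chat_platform):
    # Mirror of the rule above for allInstances()->select(cond): retrieve all
    # instances of the requested class, keep those satisfying the predicate,
    # and reply with a readable list of names.
    collection = context["collection"]
    instances = emf_platform.get_all_instances(collection["class"])
    predicate = collection["predicate"]  # callable built from FILTER/CONJ parameters
    selected = [obj for obj in instances if predicate(obj)]
    chat_platform.reply("I found the following results: "
                        + ", ".join(obj.name for obj in selected))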
4 Proof of Concept
As a proof of concept, we have created a prototype that produces Xatkit-based
chatbots [4], following the two phases depicted in Fig. 3. Xatkit is a model-
driven solution to define and execute chatbots, which offers DSLs to define the
bot intents, entities and actions. The execution of such chatbots relies on the
Xatkit runtime engine. At its core, the engine is a Java library that implements
all the execution logic available in the chatbot DSLs. Besides, a connector with
Google’s DialogFlow engine [6] takes care of matching the user utterances, and a
number of platform components enable the communication between Xatkit and
other external services.
In the context of this paper, we have developed a new EMF Platform that
allows Xatkit to query EMF models in response to matched intents. The first
version of our prototype platform3 provides actions to retrieve all the instances
of a given class, and filter them based on a composition of boolean predi-
cates on the object’s attributes or references. These predicates are retrieved
from the context parameter defined in the intents (see Sect. 3.1), and mapped
3 https://github.com/xatkit-bot-platform/xatkit-emf-platform.
Fig. 5. (a) Web application to configure the chatbot. (b) A query in the generated
chatbot.
5 Related Work
Next, we review approaches to the synthesis of chatbots for modelling or data
query.
Our work relies on NL as a kind of concrete syntax for DSLs [17]. NLP has
been used within Software Engineering to derive UML diagrams/domain models
from text [2,11]. However, the opposite direction (i.e., generating chatbots from
domain models) is largely unexplored. Almost no chatbot platform supports
automatic chatbot generation from external data sources. A relevant exception
is Microsoft QnA Maker [14], which generates bots for the Azure platform from
FAQs and other well-structured textual information.
The closest approaches to ours are tools like ModelByVoice [13] and VoiceToModel
[20], which offer some predefined commands to create model elements
for specific types of models. In contrast, our framework targets model queries
and not model creation, which was pursued in our previous work [17]. Neither of
those two approaches supports queries. Castaldo and collaborators [3] propose
generating chatbots for data exploration in relational databases, but require
an annotated schema as a starting point, while in our case providing synonyms is
an optional step. Similarly, [19] integrates chatbots into service systems by
annotating and linking the chatbot definition to the service models. In both cases,
annotations and links must be manually created by the chatbot designer to
generate the conversational elements. In contrast, our approach is fully automatic.
In [22], chatbots are generated from OpenAPI specifications, but the goal of such
chatbots is helping the user identify the right API endpoint, not answering
user queries.
Altogether, to our knowledge there are no automatic approaches to the gener-
ation of flexible chatbots with model query capabilities. We believe that applying
classical concepts from CRUD-like generators to the chatbot domain is a highly
novel solution to add a conversational interface to any modelling language.
6 Conclusion
Conversational interfaces are becoming increasingly popular to access all kinds
of services, but their construction is challenging. To remedy this situation, we
have proposed the automatic synthesis of chatbots able to query the instances
of a domain meta-model.
In the future, we aim to support more complex queries, including the conver-
sational and reactive ones mentioned in Sect. 2.1. Our approach could be used to
query other types of data sources (e.g., databases or APIs) via an initial reverse
engineering step to build their internal data model and translate the NL query
into the query language of the platform. Finally, we would like to add access
control on top of the bot definition to ensure users cannot explore parts of the
model/system unless they have permission.
References
1. Amazon: Amazon Lex (2019). https://aws.amazon.com/lex/
2. Arora, C., Sabetzadeh, M., Briand, L.C., Zimmer, F.: Extracting domain mod-
els from natural-language requirements: approach and industrial evaluation. In:
Proceedings of the MoDELS, pp. 250–260. ACM (2016)
3. Castaldo, N., Daniel, F., Matera, M., Zaccaria, V.: Conversational data explo-
ration. In: Bakaev, M., Frasincar, F., Ko, I.-Y. (eds.) ICWE 2019. LNCS, vol.
11496, pp. 490–497. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-
19274-7_34
4. Daniel, G., Cabot, J., Deruelle, L., Derras, M.: Xatkit: a multimodal low-code
chatbot development framework. IEEE Access 8, 15332–15346 (2020)
5. Erlenhov, L., de Oliveira Neto, F.G., Scandariato, R., Leitner, P.: Current and
future bots in software development. In: Proceedings of the BotSE@ICSE, pp.
7–11. IEEE/ACM (2019)
6. Google: DialogFlow (2019). https://dialogflow.com/
7. IBM Watson Assistant (2019). https://www.ibm.com/cloud/watson-assistant/
8. Jackson, P., Moulinier, I.: Natural Language Processing for Online Applications:
Text Retrieval, Extraction and Categorization, vol. 5. John Benjamins Publishing,
Amsterdam (2007)
9. Kerlyl, A., Hall, P., Bull, S.: Bringing chatbots into education: towards natural
language negotiation of open learner models. In: Ellis, R., Allen, T., Tuson, A.
(eds.) SGAI 2006, pp. 179–192. Springer, London (2007). https://doi.org/10.1007/
978-1-84628-666-7_14
10. Klopfenstein, L., Delpriori, S., Malatini, S., Bogliolo, A.: The rise of bots: a survey
of conversational interfaces, patterns, and paradigms. In: Proceedings of the DIS,
pp. 555–565. ACM (2017)
11. Landhäußer, M., Körner, S.J., Tichy, W.F.: From requirements to UML models
and back: how automatic processing of text can support requirements engineering.
Softw. Qual. J. 22(1), 121–149 (2014)
12. Lebeuf, C., Storey, M.D., Zagalsky, A.: Software bots. IEEE Softw. 35(1), 18–23
(2018)
13. Lopes, J., Cambeiro, J., Amaral, V.: ModelByVoice - towards a general purpose
model editor for blind people. In: Proceedings of the MODELS Workshops. CEUR
Workshop Proceedings, vol. 2245, pp. 762–769. CEUR-WS.org (2018)
14. Microsoft: QnA Maker (2019). https://www.qnamaker.ai/
15. OCL (2014). http://www.omg.org/spec/OCL/
16. Pérez-Soler, S., Guerra, E., de Lara, J.: Collaborative modeling and group decision
making using chatbots in social networks. IEEE Softw. 35(6), 48–54 (2018)
17. Pérez-Soler, S., González-Jiménez, M., Guerra, E., de Lara, J.: Towards conver-
sational syntax for domain-specific languages using chatbots. JOT 18(2), 5:1–21
(2019)
18. Shawar, A., Atwell, E., Roberts, A.: FAQchat as an information retrieval system.
In: Proceedings of the LTC, pp. 274–278. Wydawnictwo Poznańskie, Poznań (2005)
19. Sindhgatta, R., Barros, A., Nili, A.: Modeling conversational agents for service sys-
tems. In: Panetto, H., Debruyne, C., Hepp, M., Lewis, D., Ardagna, C.A., Meers-
man, R. (eds.) OTM 2019. LNCS, vol. 11877, pp. 552–560. Springer, Cham (2019).
https://doi.org/10.1007/978-3-030-33246-4_34
20. Soares, F., Araújo, J., Wanderley, F.: VoiceToModel: an approach to generate
requirements models from speech recognition mechanisms. In: Proceedings of the
SAC, pp. 1350–1357. ACM (2015)
21. Thomas, N.: An e-business chatbot using AIML and LSA. In: Proceedings of the
ICACCI, pp. 2740–2742. IEEE (2016)
22. Vaziri, M., Mandel, L., Shinnar, A., Siméon, J., Hirzel, M.: Generating chat bots
from web API specifications. In: Proceedings of the ACM SIGPLAN Onward!, pp.
44–57 (2017)
23. Xu, A., Liu, Z., Guo, Y., Sinha, V., Akkiraju, R.: A new chatbot for customer
service on social media. In: Proceedings of the CHI, pp. 3506–3510. ACM (2017)
Enterprise and Business Modeling
(EMMSAD 2020)
Conceptualizing Capability Change
1 Introduction
Organizations are dynamic systems, constantly being in a state of change and evolution
[1]. This state is driven by the dynamism existing in the organization's environment,
both internally and externally. In the face of environmental opportunities and threats,
organizations need to change to improve their effectiveness at achieving their goals
[2], or to ensure survival [3]. The changes occurring in an organization's environment
are characterized by a speed and direction that are often difficult to anticipate [3]. In
addition, the environment's pace of change is higher than the organization's [4], and the
speed is further increased by factors like the digital transformation of society [5] and
emerging technologies and strategies [3]. The concepts of change and strategy are not
only linked to each other, but also to the concept of capability [6].
The notion of capability bears significance because it depicts an organizational
viewpoint that encompasses several notions significant to organizational change. For
example, goal, decision, context, process and service [7, 8] have been used, especially
in the management literature, not only to describe an organization's value-generating
elements, but also as the core concepts of Enterprise Modeling (EM) approaches [8, 9].
EM, as a discipline, captures relevant knowledge and provides motivation and input
for designing Information Systems (IS) to support the organization [10]. ISs are signif-
icant for every organization since they help in simplifying the organization’s activities
and processes and have gradually become integrated with almost every aspect of the
business [11], to the level where business and IT have been “fused” into one [5]. This
integration has raised several challenges for EM, especially regarding organizations
that are in motion, changing and evolving. Due to the high rate of change in modern
enterprises, maintaining models that sufficiently capture the architecture from
the perspective of the involved stakeholders does not seem feasible. One of the main
challenges for EM is how to capture the motion of an organization, that is, its current and
desired affairs [5]. Capability modeling, as a specialization of EM, also needs to tackle
this challenge. This can be achieved by optimizing existing approaches or by developing
specific modeling approaches for depicting capability change.
The objective of this study is to propose a meta-model for depicting capability
change. It belongs to a research project that aims to provide methodological and tool
support for organizations that are undergoing changes or need to change. The project is
elaborated following the principles of Design Science Research (DSR) [12, 13]. Following
the exploration of the field [14], the elicitation of requirements for the artifact [15], and
the introduction of a typology for changing capabilities [16], the present study belongs
to the project step that concerns the initial development of a meta-model. This is a
design artifact that will serve as a basis for a capability modeling method. In addition,
the meta-model is demonstrated by applying it to an existing case, in particular, a public
healthcare organization in Sweden which is undergoing changes.
The rest of the paper is structured as follows. Section 2 provides a brief presentation
of the background and research related to this study. Section 3 describes the methods
employed in this study. Section 4 introduces and describes the capability change meta-
model and its components. Section 5 presents an example application of the meta-model
on a case study. Section 6 discusses the meta-model and its application. Section 7 provides
concluding remarks.
2 Background
This section presents a brief overview of the existing capability modeling research and
the topics relevant to the development of the meta-model.
Organizations are social goal-directed systems which maintain boundaries that reflect
their goals [3]. Changing organizations have been widely researched. Several terms
describe the phenomenon, for example change, transformation, and adaptation [3]. These
terms are sometimes used interchangeably or used to reflect different scopes of the
undergoing changes [17]. The terms business, organization and enterprise are often
used interchangeably as well; however, there are also cases where they are
distinguished, as for example in [1], where an enterprise is defined as a collection of
organizations that share a common set of goals.
Regarding the drivers of organizational change, there are several perspectives and
associated theories. Zimmermann [3] has provided a detailed analysis of these
perspectives. One of the main perspectives is based on the assumption of human
rationality and utility maximization, which results in assuming that entire organizations
rationally adapt to their environment [3]. The theories that consider the environment as
the factor setting the point in time and the direction of change are called deterministic, in
contrast to the voluntaristic theories that build on the importance of strategic choice.
The latter perspective emphasizes the strategic choice of the organization's
decision-makers and their role in shaping the organization [3].
However, there are also theories that reconcile these perspectives to facilitate an
understanding of change as a combination of environmental and managerial forces, taking
also organizational inertia into consideration. For example, the cognitive approach aims to
understand the processes of an organization that lead both to prosperity and decline, and
also to failure to change. This is preceded by the definition of cognition as the process
that involves the perception and interpretation of the environment and the translation of
this information into strategic choice [3]. Including the negative aspect of change is in
line with our earlier work [16].
A noteworthy point is that the diverse drivers of change do not provide any
indication about causes and consequences. The causality of change and the causal
relationships among the factors driving change, which have often been neglected, should be
implemented in research methods aiming to capture the complexity of change [3].
The process of creating a model capturing all the aspects of an enterprise that a modeling
purpose requires is called EM. Thus, the produced model consists of interconnected
models, each of them focused on one specific viewpoint of the modeled enterprise,
for example, processes, goals, concepts and business rules [18]. Any organization or
part of it can benefit from the application of EM.
An enterprise model can help people in an organization develop a deeper
understanding of the system, in other words, how their work is integrated in a bigger
picture; additionally, models can enable the users' understanding of the supporting
information systems and their interplay with organizational action patterns [19].
Furthermore, since meta-models specify modeling languages, they are valuable
to (i) modelers, who are interested in understanding and applying the language, (ii)
researchers, who have an interest in evaluating and adapting a language, for example to a
domain-specific version, and (iii) tool vendors, who have an interest in developing tools
for the language [20].
2.3 Capabilities
There is no consensus on the definition of capability in the literature. In this study, the
notion of capability is defined as a set of resources, whose configuration bears the ability
and capacity to create value by fulfilling a specific goal within a specific context. This
definition is the result of combining two earlier definitions from [8, 21].
The concept of capability is often considered the missing link in business/IT
transformation [22]. Its growing popularity can be attributed to the fact that it enables
business/IT transformations by (i) providing a common language to the business, (ii)
enabling to-the-point investment focus, (iii) serving as a baseline for strategic planning, change
management and impact analysis, and (iv) leading directly to business specification and
design [22].
Capability Modeling
The capability modeling approaches that exist in the literature have been identified and
their meta-models explored in our earlier work. In particular, 64 capability
meta-models were analyzed using a change function-related framework [14]. The
change functions of the framework are observation, decision and delivery of capability
change. It was identified that the majority of the meta-models include concepts that
address, at least partially, all the above mentioned functions, and have a scope combining
business and IT. A set of change-related concepts has been elicited for inclusion in a
capability change meta-model, so as to facilitate the development of a method.
Regarding the modeling of capabilities, [23] have suggested three strategies within
the Capability-Driven Development (CDD) method. All three strategies consist of three
steps, which are (i) Capability design, (ii) Capability evaluation, and (iii) Development
of Capability delivery application. Steps two and three are common in all strategies. The
second step concerns the evaluation of the design from both business and technical per-
spectives before the implementation of the capability. The third step involves packaging
the indicators for monitoring and the algorithms for run-time adjustments as a support
application. The differentiation among the three strategies lies only in the first step. It
concerns the design of the capability using as a starting point: (i) goals, (ii) processes,
or (iii) concepts [23].
3 Methodology
This section presents the methods employed for the development of the meta-model and
the case study.
The manner in which these have driven the development of the meta-model is dis-
cussed below. In addition, the dimensions of capability and change have been researched
and this resulted in introducing a state-based capability typology [16], where the possible
states of capability change have been presented as a UML State Machine diagram. This
earlier work serves as input for the development of the meta-model.
Finally, the constraints in [27] have been taken into consideration, which suggest that
(i) the meta-model should be minimal, which means that no elements that are not moti-
vated by the elicited information needs should be included, (ii) the design rationale for
the included elements should be recorded and (iii) the semantics of the included elements
should be clarified to avoid possible misunderstandings among different stakeholders.
For the analysis, the experts were asked to identify the potential impact of the change,
which means that an experiential approach [28] was applied.
The meta-model and its component elements are presented in this section.
Resource (Goal 21, 23, 24, 25): Employing resource and capability analysis
explains how resources can deliver added value [30]. The concept includes capital,
infrastructures, human resources, etc. Apart from being one of the most popular
concepts in capability meta-models, the concept has also been identified as an important
factor for two of the change functions. A set of resources is what comprises a capability
configuration and is involved not only in the delivery of change, due to reallocations, but
also in the decision function, using reallocation and new resources as a means for
identifying new capability alternatives [15]. A Resource may belong to one or more owners,
and is allocated to one or more capability configurations.
Organization (Goal 28): The concept refers to any public or private organization or
organizational unit. Any organization can interact with one or more organizations. In this
meta-model, the emphasis is not on the architecture of an organization, therefore, the
organization element only depicts an organization as a capability owner. Additionally,
the organization determines a capability’s internal context.
Owner (Goal 22, 23, 24, 25): This concept determines the ownership of capabilities
and resources. Any number of owners can own any number of capabilities and any
number of resources. In the meta-model, it is used as a generalization of organization.
Interaction type (Goal 28): This association class element describes the interaction
between organization elements, for example, collaboration or outsourcing.
Organizational boundary (Goal 26, 27): The importance of organizational boundary
has been identified in our earlier work [15, 31]. It defines the limits of an organization’s
capabilities. As an association class element in the meta-model, it is determined by the
interaction type, it determines at least one type of interaction and may be regulated by
boundary control.
Boundary control (Goal 27): Initially explored as a modeling element in [31], the
concept concerns any type of control between organizations, so as to regulate the inter-
action. It may refer to any level of control, from an informal agreement, to a detailed
formal contract.
Configuration (Goal 4, 6, 10, 20, 21): The complete set of resources that comprise
the capability, along with the behavior elements that deliver it. A capability may have
several different configurations, but only one may be active at any given moment in time
(illustrated in the sketch after these element descriptions). In the meta-model, the actual
change is captured as a transition between configurations. A configuration partially
consists of one or more behavior elements and has allocated resources, and thus
specifies a capability.
Change (Goal 1, 2, 3, 4): Captures the change process as a whole. It has at least one
change type and consists of at least one function. In addition, it is associated to one state.
Change Type (Goal 17, 18, 19): This element may describe change elements. Possible
types of changes are introduction, modification or retirement [15].
Function (Goal 2, 3, 4): The function element refers to the specific change functions
that have been identified in our earlier work [14]. More specifically, one or more functions
comprise change and it is a generalization of observation, decision and delivery.
Observation (Goal 2, 7, 12, 13, 14): The observation function concerns monitoring
a capability by capturing relevant external and internal data. The observation element
in the meta-model is meant to depict the collecting sources of data valuable for eval-
uating a capability’s performance. It is a change function, it consists of one or more
measurements, and leads to one or more decisions.
Measurement (Goal 15, 16): It concerns the activity of assessing a factor relevant
to the capability's performance. The element has a natural association to measurable
indicators like KPIs. It is part of observation, can be applied to outcomes, assesses
one or more context factors, and may result in the elaboration of decision criteria.
Outcome (Goal 15, 17, 18, 19): The outcome of a capability realization is used to
provide insight on whether a capability change is required or not. It is the result of one
or more capabilities and may be subjected to one or more measurements.
Decision (Goal 3, 5, 6, 7, 8): The decision activities are related to analyzing context
data to make a decision on capability change, regarding whether an adjustment
or a transformation is required and which capability configuration is optimal for the
adaptation. Therefore, decision is a change function; it is determined by at least one
criterion, may be motivated by one or more intention elements, and leads to delivery,
while observation leads to it.
Intention element (Goal 8): This abstract meta-element includes all the concepts that
refer to the intentions driving the change, i.e. concepts like goal, objective and desire.
An intention element may motivate one or more decisions.
Criterion (Goal 5): Decision criteria provide the standards that will be used in order
to make a decision. A criterion is often formulated through observation of the context.
In the meta-model, one or more criteria determine a decision and may be determined by
one or more measurements of the context.
Delivery (Goal 4, 10): Delivery of change refers to how the decision on change is
applied affecting the way a capability is realized and capabilities’ interrelationships.
Regarding the delivery element in the meta-model, it is a change function, at least one
decision leads to it, and as an association class, describes the transition between capability
configurations.
Behavior element (Goal 4): Another abstract meta-element which is meant to depict
every possible process, service or activity that is involved in the realization of the
capability. A behavior element is part of one or more capability configurations.
State (complies with [16]): The notion of state has been explored in earlier research
in relation to capability and change. The attribute potentiality has two possible values,
that is, enabled or disabled. In the meta-model the state class is a generalization for
capability and change states.
Capability state (complies with [16]): A specialization of the state class; it is associated
to one or more capabilities. The purpose attribute reflects whether the capability is meant
to fulfil a goal or to avoid a problem.
Change state (complies with [16]): A specialization of the state class, this element
includes as attributes the dimensions of change introduced in [16], i.e., scope, control,
frequency, stride, desire, intention and tempo, which are specified for every state,
meaning that the attributes always need to have a value.
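As a rough operational reading of the meta-model's core (a sketch of ours; the meta-model itself is a UML class diagram), the following captures the Capability–Configuration relation with the single-active-configuration constraint, delivery as a transition between configurations, and the change-state dimensions:

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Configuration:
    resources: List[str]           # allocated Resource objects (names here)
    behavior_elements: List[str]   # processes/services realizing the capability
    enabled: bool = False

@dataclass
class Capability:
    name: str
    configurations: List[Configuration] = field(default_factory=list)

    def active(self) -> Optional[Configuration]:
        # Only one configuration may be active at any given moment in time
        enabled = [c for c in self.configurations if c.enabled]
        assert len(enabled) <= 1, "at most one active configuration"
        return enabled[0] if enabled else None

    def deliver_change(self, old: Configuration, new: Configuration) -> None:
        # Delivery of change is captured as a transition between configurations
        old.enabled = False
        new.enabled = True

@dataclass
class ChangeState:
    # Dimensions of change from [16]; every state specifies a value for each
    scope: str
    control: str
    frequency: str
    stride: str
    desire: str
    intention: str
    tempo: str
    potentiality: str = "enabled"  # enabled or disabled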
connecting it with two configuration objects, namely Configuration A and B. The former
is the one enabled before the change and the latter is the one enabled after the change.
This is captured in the Delivery object, which is associated to both Configuration objects
to depict the transition. The delivery of change disables Configuration A and enables
Configuration B. The difference between the two configurations is depicted through
their allocated resources. Configuration A is only associated to the Guidance system
and Provider catalogue system, while B is also associated to the Symptom – provider
system. An 1177 Behavior element object is associated to both configurations. It is
interesting to note that the Provider catalogue and Guidance system are associated to the
Private provider and National public provider respectively, so as to represent ownership.
This means that while RH owns the capability, the two collaborating organizations
own involved resources. Especially when it comes to the symptom – provider system,
the object, which is part of Configuration B, is owned by RH along with the System
development capability, on which Health guidance depends, and the Expert physicians
used to develop it. However, the Provider catalogue system and the Provider data are
owned by Private provider, so the Collaboration Interaction type object is not enough to
represent the case. A Data availability Organizational boundary and a Contract Boundary
control object complement the required information for this part of the diagram.
6 Discussion
This study is part of an ongoing work; hence, the presented meta-model artifact
will evolve in the following design-evaluation cycles. DSR guidelines also
state that artifact development is almost always an iterative step [13]. What an
iterative process implies is the identification of weaknesses and strengths in the artifact,
which can be translated into opportunities for improvement. Since a meta-model is an
essential method component, the current version of the meta-model contributes towards
establishing the foundation for the development of a method specially designed for
managing capability change, by (i) capturing the relevant information according to the
elicited goals, (ii) decomposing change into the previously elicited functions that it
consists of, and (iii) depicting the transition during change implementation.
Regarding the efficiency of the meta-model in capturing the needed information and
depicting the aspects of capability change, the demonstration using the RH case suggests
that all the required factors have been taken into consideration and the information
structure seems adequate for the particular case. The set of goals elicited in our previous
work has been fulfilled, even though the case was not optimal for details in goals like
the elicitation of context, because, to a certain degree, the elicitation had already been
performed by the stakeholders and interviewees, in terms of recognizing the political,
employee and partner influence on the Health guidance capability. Therefore, any future
application of the meta-model should favor the selection of a case where the need and
reasons for change are not obvious, so that the meta-model's possible deficiencies will
be indicated.
The main concern that has arisen from the application in this study is the complexity
of the produced models, which is a result of the meta-model's structure and leads
to visually cluttered models. During the instantiation of the meta-model to an
object diagram, the complexity of the object diagram exceeded our initial expectations
and what can intuitively be seen as practical for the purpose of communicating with
domain experts. For this reason, several pieces of information have been omitted in
order to reduce the complexity of the result, in terms of a less cluttered model. For
instance, many resources involved in the configurations of the capability are missing
because their role in the undergoing change was deemed of lower priority: the
specialized nurses that perform the guidance, the telephone system and the journal
system do not affect the change project. Yet, omitting pieces of information is not
feasible in every case, and this should be addressed in the meta-model.
During the development of the meta-model, the decision to develop a single unified
meta-model that would encompass all aspects of change already indicated a high level of
complexity. For this reason, certain aspects of capability change were slightly neglected,
which means that a higher level of abstraction was applied, a fact that resulted in more
generic meta-elements, for example, Intention element and Behavior element. However,
it is no coincidence that all the generic meta-elements reflect aspects that could
be decomposed into entire models. For example, decomposing Intention, Behavior or
Context elements may require the integration of a goal, process [18] or context model
[29], respectively. Among the elements that were included in the meta-model, the point
of emphasis that stands out as the epicenter of change is the Configuration class and
its recursive association that depicts transitions between configurations. Any further
decomposition of this part will only promote the initial goal, to model capability change.
Nevertheless, decomposing goals and processes may be useful; yet, it is not the main
focus point of this project.
This raises the question of employing the technique of slicing meta-models [20], which means that the meta-model is split according to specific viewpoints. On the one hand, pre-existing common viewpoints like goals, processes, and context can be integrated using existing compatible approaches, saving significant time and effort. On the other hand, within this project, new viewpoints can also be elaborated, i.e., capability, observation, decision, delivery, and ownership. This is a possible future step that is worth exploring before deciding to finalize the artifact.
7 Conclusion
In this study, a meta-model has been presented that will act as a basis for the development of a method for modeling and analyzing capability change. Being a DSR project, its nature is iterative; therefore, several iterations are expected before the artifact is finalized. Accordingly, the introduction of the meta-model was followed by an application of the artifact to a real case for demonstration purposes. This application revealed opportunities for improvement, especially in the area of complexity, including the possibility to introduce viewpoints in a later version of the model.
The next step in our project is to validate the meta-model via interviews with experienced decision-makers. This activity will provide additional insights towards the finalization of the artifact. In parallel, the initial experience of this study will be taken into consideration in order to explore the implementation of viewpoints in the next stage of method development.
Acknowledgment. We would like to express our gratitude to the employees of RH who took the time to let us interview them to identify and describe the presented case.
References
1. Proper, H.A., Winter, R., Aier, S., de Kinderen, S. (eds.): Architectural Coordination
of Enterprise Transformation. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69584-6
2. Burnes, B.: Managing Change. Pearson, Harlow, England (2014)
3. Zimmermann, N.: Dynamics of Drivers of Organizational Change. Gabler, Wiesbaden (2011).
https://doi.org/10.1007/978-3-8349-6811-1
4. Burke, W.W.: Organization Change: Theory and Practice. Sage Publications, Thousand Oaks
(2017)
5. van Gils, B., Proper, H.A.: Enterprise modelling in the age of digital transformation. In:
Buchmann, R.A., Karagiannis, D., Kirikova, M. (eds.) PoEM 2018. LNBIP, vol. 335, pp. 257–
273. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-02302-7_16
6. Hoverstadt, P., Loh, L.: Patterns of strategy. Routledge, Taylor & Francis Group, London,
New York (2017)
7. Loucopoulos, P., Stratigaki, C., Danesh, M.H., Bravos, G., Anagnostopoulos, D., Dimi-
trakopoulos, G.: Enterprise capability modeling: concepts, method, and application. In: 2015
International Conference on Enterprise Systems (ES), pp. 66–77. IEEE, Basel (2015)
8. Sandkuhl, K., Stirna, J. (eds.): Capability Management in Digital Enterprises. Springer, Cham
(2018). https://doi.org/10.1007/978-3-319-90424-5
9. Loucopoulos, P., Kavakli, E.: Capability oriented enterprise knowledge modeling: the
CODEK approach. In: Karagiannis, D., Mayr, H.C., Mylopoulos, J. (eds.) Domain-Specific
Conceptual Modeling, pp. 197–215. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-39417-6_9
10. Persson, A., Stirna, J.: An explorative study into the influence of business goals on the practical use of enterprise modelling methods and tools. In: Harindranath, G., et al. (eds.) New Perspectives on Information Systems Development, pp. 275–287. Springer, Boston (2002). https://doi.org/10.1007/978-1-4615-0595-2_22
11. Pearlson, K.E., Saunders, C.S., Galletta, D.F.: Managing and Using Information Systems: a
Strategic Approach. Wiley, Hoboken (2020)
12. Hevner, A., Chatterjee, S.: Design Research in Information Systems. Springer, Boston (2010).
https://doi.org/10.1007/978-1-4419-5653-8
13. Johannesson, P., Perjons, E.: An Introduction to Design Science. Springer, Cham (2014).
https://doi.org/10.1007/978-3-319-10632-8
14. Koutsopoulos, G., Henkel, M., Stirna, J.: Dynamic adaptation of capabilities: exploring
meta-model diversity. In: Reinhartz-Berger, I., Zdravkovic, J., Gulden, J., Schmidt, R. (eds.)
BPMDS/EMMSAD 2019. LNBIP, vol. 352, pp. 181–195. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20618-5_13
15. Koutsopoulos, G., Henkel, M., Stirna, J.: Requirements for observing, deciding, and delivering
capability change. In: Gordijn, J., Guédria, W., Proper, H.A. (eds.) PoEM 2019. LNBIP, vol.
369, pp. 20–35. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-35151-9_2
16. Koutsopoulos, G., Henkel, M., Stirna, J.: Modeling the dichotomies of organizational
change: a state-based capability typology. In: Feltus, C., Johannesson, P., Proper, H.A. (eds.)
Proceedings of the PoEM 2019 Forum, pp. 26–39. CEUR-WS.org, Luxembourg (2020)
17. Maes, G., Van Hootegem, G.: Toward a dynamic description of the attributes of organizational
change. In: Research in Organizational Change and Development, pp. 191–231. Emerald
Group Publishing Limited (2011)
18. Sandkuhl, K., Stirna, J., Persson, A., Wißotzki, M.: Enterprise Modeling: Tackling Business
Challenges with the 4EM Method. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-43725-4
19. Frank, U.: Enterprise modelling: The next steps. Enterp. Model. Inf. Syst. Architect.
(EMISAJ) 9, 22–37 (2014)
20. Bork, D., Karagiannis, D., Pittl, B.: How are metamodels specified in practice? Empirical
insights and recommendations. Presented at the 24th Americas Conference on Information
Systems, AMCIS 2018, New Orleans, LA, USA, 16 August 2018
21. Koutsopoulos, G.: Modeling organizational potentials using the dynamic nature of capabili-
ties. In: Joint Proceedings of the BIR 2018 Short Papers, Workshops and Doctoral Consortium,
pp. 387–398. CEUR-WS.org, Stockholm (2018)
22. Ulrich, W., Rosen, M.: The business capability map: the “Rosetta stone” of business/IT
alignment. Cutter Consort. Enterp. Archit. 14(2) (2011)
23. España, S., et al.: Strategies for capability modelling: analysis based on initial experiences.
In: Persson, A., Stirna, J. (eds.) CAiSE 2015. LNBIP, vol. 215, pp. 40–52. Springer, Cham
(2015). https://doi.org/10.1007/978-3-319-19243-7_4
24. Karagiannis, D., Bork, D., Utz, W.: Metamodels as a conceptual structure: some seman-
tical and syntactical operations. In: Bergener, K., Räckers, M., Stein, A. (eds.) The
Art of Structuring, pp. 75–86. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-06234-7_8
25. Object Management Group (OMG): OMG® Unified Modeling Language®. https://www.omg.org/spec/UML/2.5.1/PDF (2017)
26. Bork, D., Karagiannis, D., Pittl, B.: A survey of modeling language specification techniques.
Inf. Syst. 87, 101425 (2020). https://doi.org/10.1016/j.is.2019.101425
27. Kurpjuweit, S., Winter, R.: Viewpoint-based meta model engineering. In: Reichert, M.,
Strecker, S., Turowski, K. (eds.) Enterprise Modelling and Information Systems Architectures:
Concepts and Applications, pp. 143–161. Ges. für Informatik, Bonn (2007)
28. Kilpinen, M.S.: The emergence of change at the systems engineering and software design
interface (2008)
29. Koç, H., Sandkuhl, K.: Context modelling in capability management. In: Sandkuhl, K., Stirna,
J. (eds.) Capability Management in Digital Enterprises, pp. 117–138. Springer, Cham (2018).
https://doi.org/10.1007/978-3-319-90424-5_7
30. Lynch, R.L.: Strategic Management. Pearson Education, Harlow, New York (2018)
31. Henkel, M., Koutsopoulos, G., Bider, I., Perjons, E.: Using the fractal enterprise model for
inter-organizational business processes. In: Joint Proceedings of the ER Forum and Poster &
Demos Session 2019, pp. 56–69. CEUR-WS.org, Salvador (2019)
Supporting Early Phases of Digital Twin
Development with Enterprise Modeling
and Capability Management: Requirements
from Two Industrial Cases
Abstract. Industry 4.0 is a concept that has attracted much research and development over the last decade. At its core is the need to connect physical devices with their digital representations, which essentially means establishing a digital twin. Currently, the technological development of digital twins has gathered much attention, while the organizational and business aspects are less investigated. In response, the suitability of enterprise modeling and capability management for developing and managing business-driven digital twins has been analyzed. A number of requirements from the literature are summarized, and two industrial cases have been analyzed for the purpose of investigating how digital twin initiatives emerge and what forces drive the start of their implementation projects. The findings are discussed with respect to how Enterprise Modeling and the Capability-Driven Development method are able to support the business motivation, design, and runtime management of digital twins.
1 Introduction
In manufacturing industries, industry 4.0 and digital transformation are interrelated fields
that both motivate the development of digital twins. Industry 4.0 is a concept that has attracted much research and development over the last decade, including reference models [1], applications [2], standards [3], and supporting methods [4]. A core idea of Industry 4.0 is
to connect physical devices (e.g., manufacturing systems and the objects they produce),
digital components (e.g. ERP or MES systems) and human actors along production
processes for the sake of seamless integration and continuous monitoring and control
[5]. Digital twins (DT) support this core idea and can be defined as “a dynamic virtual
representation of a physical object or system across its lifecycle, using real-time data to
enable understanding, learning and reasoning” [6]. Digital transformation, in general,
denotes adopting digital technologies, such as industry 4.0 related technologies or DTs,
in the digitalization of an organization’s business model and its operations (cf. Sect. 2.3).
Several researchers emphasize the importance of industry 4.0 for digital transformation [21] or, vice versa, that digital transformation motivates the implementation of industry 4.0 [20]. However, DTs as an element of digital transformation, or digital transformation as a driver for DT development, are not covered in the aforementioned work. A literature analysis (see Sect. 5) confirmed that digital twin research predominantly focuses on technological questions of DT design and operations. So far, the organizational and business model related aspects of DTs are only sparsely covered in research, which motivated this paper. In response, the paper's objective is to investigate how DT solutions are integrated into the organizational structures and business models of manufacturing enterprises, and what motivates the development of DTs from a digital transformation perspective.
Enterprise Modeling (EM) is a versatile approach and is able to tackle various orga-
nizational design problems by means of multi-perspective conceptual modeling. EM
captures organizational knowledge about the motivation and business requirements for
designing IS [7]. Hence it has the potential of capturing and representing the organi-
zational motivation for DT design. A key aspect of operating and managing DTs is to
configure and adjust them according to situational changes in operations. Capability Management, and in particular Capability Driven Development (CDD), has proven applicable for managing information systems (IS) in changing contexts [10]. For example, CDD supports the generation of monitoring dashboards from models that include context elements, measurable properties, and KPIs, as well as rule-based adjustments based on context data. In concrete terms, the goal of this paper is to analyze the suitability of EM
and capability management for the purpose of supporting the development and man-
agement of DTs from an organizational perspective. We have chosen the 4EM and CDD
methods for the purpose of this study because they have already established integration
mechanisms between themselves and with other modeling languages.
The rest of the paper is structured as follows. Section 2 gives background to EM,
CDD, and digital transformation. Section 3 describes our research approach. Section 4
presents two case studies. Section 5 summarizes the main requirements for developing
Industry 4.0 solutions found in literature. Section 6 discusses the requirements from
Sects. 4 and 5 with respect to CDD. Section 7 discusses an example of a capability
model for the purpose of DT development. Section 8 provides concluding remarks.
2 Background
2.1 Enterprise Modeling and 4EM
EM is the process of creating an enterprise model that captures all the enterprise’s
aspects or perspectives that are required for a given modeling purpose. An enterprise
model consists of a set of interlinked sub-models, each of them focusing on a specific perspective, such as processes, goals, concepts, actors, rules, or IS components.
2.2 Capability Management and CDD
In [10], the concept of capability thinking and a method for capability management are introduced. Capability thinking is an organizational mindset that puts capabilities in the focus of business model and IS development. It emphasizes that capabilities are not self-emergent; instead, they should be planned, implemented, controlled, and adjusted. In doing so, they need to be addressed from the perspectives of (1) vision (e.g., goals and KPIs), (2) enterprise designs, such as processes and IS architectures, (3) situational context, incl. measurable properties, as well as (4) best practices, such as process variants and patterns for dealing with context changes. Capability as a concept allows reasoning about these four aspects of the business in an integrated way, because enterprises need to know how to realize the business vision and designs, as well as what needs to be changed depending on real-life situations. Capability is defined as the ability and capacity that enable an enterprise to achieve a business goal in a certain context [10]. Successful implementation of capability thinking will lead to capability management as a systematic way to plan, design, develop, deploy, operate, and adjust capabilities.
CDD is a method supporting the four perspectives of capability thinking. CDD con-
sists of a number of method components each focusing on a specific task of the capability
cycle, such as Capability Design, Context Modeling, Patterns and Variability Modeling,
Capability Adjustment Algorithm Specification, as well as method extensions for deal-
ing with certain business challenges such as supporting business process outsourcing
and managing service configuration with the support of open data [17].
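To give a rough intuition of what a capability adjustment specified with such method components amounts to, the following sketch ties a measured context property to a rule-based reaction. It is a generic illustration in plain Java, not CDD's actual notation or tooling, and all names (the threshold, the adjustment action) are invented:

// Generic sketch of a context-driven capability adjustment: a measurable
// property is monitored against a KPI threshold and, when the context
// changes, a predefined adjustment (e.g., switching to another process
// variant) is triggered.
public class CapabilityAdjustmentSketch {

    interface Adjustment {
        void apply();
    }

    private final double kpiThreshold;
    private final Adjustment adjustment;

    public CapabilityAdjustmentSketch(double kpiThreshold, Adjustment adjustment) {
        this.kpiThreshold = kpiThreshold;
        this.adjustment = adjustment;
    }

    // Called whenever a new measurement of the context property arrives.
    public void onContextValue(double measuredValue) {
        if (measuredValue > kpiThreshold) {
            adjustment.apply(); // e.g., activate an alternative process variant
        }
    }
}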
digitalization which form the prerequisite for the next step. In [15] we have proposed
the steps for the dimensions of operations and product digitization.
In the operations dimension, the steps are (1) replacing paper documents with digital representations, (2) end-to-end automated processing of these digital representations within a process, and (3) integration of relevant processes within the enterprise and with partners. In the product dimension, the departure point for digitization is physical products without built-in information or communication technology. The digitization steps are (1) to enhance the product/service by providing complementary services (maintenance information, service catalogs) without actually changing it, (2) to extend the functionality and value proposition of products by integrating sensors and actuators, and (3) to redefine the product or service, which leads to a completely new value proposition. A completed digital transformation requires all three steps in both dimensions.
3 Research Approach
This study is part of a research program aiming to provide methodological and tool
support for organizations in dynamic contexts, e.g., supporting the process of digital
transformation and capability management. It follows the five stages of Design Science
research [16], namely problem explication, requirements definition, design and development of the design artifact, demonstration, and evaluation. This study concerns
the first two steps for the design artifact supporting DT design and management from an
organizational perspective. This part of our research started from the following research
question which is based on the motivation presented in Sect. 1: RQ: In the context of
digital transformation, how are digital twin initiatives emerging and what are the driving
forces for starting implementation projects?
The research method used for working on this research question is a combination of
literature study and descriptive case study. Based on the research question, we identified
industrial cases of digital transformation suitable for studying the origin of DT develop-
ments, i.e. we performed qualitative case studies in order to obtain relevant and original
data (see Sect. 4). Qualitative case study is an approach to research that facilitates explo-
ration of a phenomenon within its context using a variety of data sources. This ensures
that the subject under consideration is explored from a variety of perspectives which
allows for multiple facets of the phenomenon to be revealed and understood. Within
the case studies, we used three different perspectives, which at the same time represent
sources of data: we analyzed documents about business models, products, manufactur-
ing process of the companies; we performed workshops targeting digital transformation
and DTs as part thereof; and we interviewed domain experts. Yin [18] differentiates between explanatory, exploratory, and descriptive case studies. The case studies in Sect. 4 are
considered descriptive, as they describe the phenomenon of initiating DT development
and the real-life context in which it occurs.
Based on the results of the case studies, primarily the case study requirements for DT development, we selected research areas with relevant work for these requirements and analyzed the literature in these areas. The purpose of the analysis was to find existing approaches and methods for modeling DTs and for integrating them into the business. This work limits its focus to DTs in manufacturing, although they can also be used in other application fields. To summarize,
– The case studies explore whether business models and organizational context are really relevant from an industrial perspective. We focus on the early phases of DT realization, i.e., decision making and specification,
– A literature study explores whether existing research work covers modeling
approaches for business models and organizational context of DT.
The product manager stated as one of the motivations for the workshop: “Our datalink
device is nearly ready. It captures data and puts them into our own cloud. So far, we
only capture data about malfunction or energy consumption that is anyhow visible on
the pump’s display. But we do not have a good idea, how to do business with this data.
And we probably need more sensors.”
Among the top innovation ideas were (a) smart pumps and (b) pumping as a ser-
vice, which the workshop participants both related to the topic of digital twins. When
discussing the smart pump, the sales representative explained: “We think that our bigger
customers want to have control if our pumps do what they are supposed to do in their
installations. Some of them call it the digital twin. This would help us to sell pumps to
them. We have to use or develop sensors that deliver this kind of information.”
Pumping as a service aims at selling the functionality of the pump instead of the pump as a physical device, which would lead to a service agreement where the company is paid for pumped cubic meters or hours of pumping. One of the participants remarked
to this idea: “For this, we need full control what is happening with the pump. So, we
need something a bit like a digital twin, but for our internal purposes.”
When developing the business model prototype for pumping as a service, most of
the discussion time was spent on organizational issues within the company: “where does
all the information from our pumps arrive, how do we make sense out of it and how do
we organize the reaction?” For the smart pumps, the discussion was more about “how
do we integrate our pumps in the DT system of our customer and what kind of sensors
do we need?” Furthermore, the development department mentioned “We would need
to know what technical basis our customers use for their DTs and what interfaces we
have to provide. But most of our customers have no real answers to these questions.
Sometimes we get the impression that they simply don’t know.”
unrealistic changes, such as a reduction of the production time for forms to 10% of the current value, no setup time of the production system, or internal logistics requiring no staff.
Preparation and execution of the workshop were similar to the first case study: the selected participants represented all relevant departments of the company (design, production, logistics, procurement, human resources, economics, service and customer care), mostly through the head of the unit or senior experts. All ten participants were informed beforehand about the purpose of the workshop, the need to think “out-of-the-box”, and the importance of their participation. The workshop included the collection and clustering of ideas for radical transformation of products and of operations, joint clustering, and the definition of priorities. Based on the priorities, an initial evaluation of the top three options for radical transformation of products and the top three transformations in operations was done. The content of the workshop was documented in photo documentation of collected ideas and clusters, written documentation of the evaluation results, and notes. The workshop was conducted by two researchers: one facilitator and one note taker. This paper analyzes the documented content.
For radical transformation of internal operations, one of the clusters identified was
named “digital twin of the own factory”. The primary intention was to always have a real-time status of all resources in the company's own production system, including facilities, parts, and
staff. For the radical transformation of the products, one of the clusters was the DT of each
individual form on the customers’ site. It is expected that a fully digitalized and automated
press shop would need full control and real-time monitoring of the complete production
flow and all components of the press shop. In this regard, the workshop participants
discussed how to set up the cooperation with press manufacturers and logistics companies
to discuss standards for the DT.
company A installed), and c) the combination of a) and b), i.e. the company’s product
monitored in a client’s facility. E.g., the form of company B with remote monitoring
for purposes of preventive maintenance and local monitoring for optimizing production
in the press shop. Options a) and b) require different information to be aggregated,
displayed, and monitored.
CSR3: What aspect of reality has to be represented in a DT depends on the
organizational integration and the intended business model of the company.
CSR1 states that DTs must support a company's business model. When implementing business models, this means that the digital twin has to provide the information about the status or operations of the product that is required for the value creation underlying the business model. For example, in case A, pumping-as-a-service requires capturing the performance of a pump to be able to invoice the provided hours or pumped volume, the energy consumption of the pump, and the status of lubricants to avoid problems in the service.
CSR4: Identification of features and parameters that have to be visible in the DT can
be supported by developing business model prototypes and investigating organizational
integration.
In both case studies, the options for new DT-based services were subject to an initial
feasibility study. This study started from the definition of what service has to be provided
for the customer, what information and functionality are required for the services (i.e.,
specification of features and parameters) and how this information is processed and used
in the enterprise to deliver the service (i.e. the organizational processes).
CSR5: Component developers request a better methodical and technical integration
of DTs (platform) development and component development.
In particular in case B, the case study company made clear that the development
of a smart form would require collaboration with the manufacturer of the press for
implementing the vision of a smart press shop. In case A, a similar request emerged
when discussing the integration of pumps into complex systems, such as a cruise ship.
Both cases showed the need for technical agreements (interfaces and platforms) and
methodical agreements with the digital twin provider.
CSR6: Business models and organizational processes are subject to continuous
improvement and so are DT features and parameters.
During the development of business model prototypes, in both cases a kind of roadmap for stepwise implementing and extending the services and the business model was discussed, and the actual prototype was intended to cover only the first stage. An expectation was
expressed that the first stage would have to be extended based on the feedback of the
customer and lessons learned from operations. With respect to modeling support, our
recommendation is to explicitly model organizational context and business models as
preparation of the DT design.
DTs are usually designed and operated in the context of industry 4.0. In the field of pro-
duction systems, there is a substantial amount of work on DTs. In the context of this paper,
the intersection of digital transformation and DTs as an industry 4.0 solution is most relevant.
Mittal et al. [20] investigated what manufacturing small and medium-sized enterprises (SMEs) need to successfully adopt industry 4.0. As a result, 16 specific requirements for SMEs were identified, including smart manufacturing solutions for specialized products, which include DTs. Schumacher et al. [21] proposed a maturity model for assessing Industry 4.0 readiness and identified nine dimensions of maturity and 62 maturity items in their Industry 4.0 Maturity Model. The maturity items include technology and product related aspects, like digitalization of products, product integration into other systems, and DTs. Considering the objective of this research, our primary focus is on supporting the fit of the DT to the organization's needs in the industry 4.0 program, which, as discussed previously, can be supported by modeling. There have been several investigations of the needs for modeling support for industry 4.0. Hermann et al. [9] present four main principles of industry 4.0, namely interconnection, information transparency, decentralized decisions, and technical assistance.
Wortmann et al. [8] report on a systematic literature review and, in terms of the expected benefits of modeling for industry 4.0, put forward the following: reducing time (development time, time-to-market), reducing costs (of development, integration, configuration), improving sustainability, and improving international competitiveness. This is in line with the general intentions of applying development methods and tools. In the context of industry 4.0, modeling addresses cyber aspects, physical aspects, or cyber-physical aspects, of which the latter is the least researched and for which the fewest contributions have been elaborated. Wortmann et al. indicate that current trends include methods for modeling digital representation, failure handling, human factors, information management, integration, process, product, configuration validation and verification, as well as visualization. The areas of product modeling, validation and verification, and information management are currently attracting the most attention. Human factors and visualization are addressed by considerably fewer contributions.
However, this study focused mostly on methods that have proven useful for IS design
and development, and these methods do not support a holistic view on design that
integrates organizational and human aspects with the more common IS aspects.
The analysis of the current state of modeling for the purpose of designing industry 4.0 solutions, including DTs, calls for advancements in a number of areas, as follows.
Concerning modeling and model management:
– Support for integrated multi-perspective views on all aspects of the solution, such as business and organizational design, IS architecture, implementation, and operation at runtime.
– Integration of different artifact kinds, such as models and 3D drawings. In this regard, Wortmann et al. call for the integration of models into the engineering, deployment, and operation processes. To achieve alignment, the integration should start with the business design and the requirements for the engineering process.
– Supporting design models with runtime data and, consequently, extracting models
that can be used in later design iterations from runtime data. Using runtime data for
the purpose of assessing the performance of designs, especially reusable designs that
are applied in several operational installations.
– Support for adaptation and adjustment of the solution according to changing business
goals and requirements as well as application context.
– The solution should have built-in means for runtime adaptations that do not require
re-design and re-deployment.
With respect to the latter, [8] discuss the possibility of adopting the DevOps principles
for developing industry 4.0 solutions. The proposed vision for such a lifecycle is similar
to the CDD process [10], discussed in Sect. 6.
6 Analysis
First, we will discuss the requirements from Sect. 5 and how the three main topics of (1) modeling and model management, (2) adaptation and adjustment, and (3) continuous lifecycle management can be addressed by EM and CDD. This will be followed by a discussion of the case study requirements.
– Enterprise models to capture the business models. Later they can be linked with the DT design models representing the technical details of the DT.
– Capability design models to represent the more detailed designs of the DT.
Table 1. Requirements from case studies supported by the CDD method components
– Context models to show the dependence on local and global data in the environment, as well as to support adjustments of the DTs and their monitoring dashboards.
– The capability design models and the enterprise models need to be linked to establish
the business motivation and fit of the DT.
– Capability designs and context models should be used for generating dashboards for DT management. Key data types that have the potential of being useful here are context data, KPIs, and historical data about the performance of reusable components.
– The models used need to be reasonably open and extendable in order to be able to
incorporate additional perspectives of modeling.
Requirements CSR5 and CSR6 can be supported by the CDD method components discussed in the previous section, but they also call for the establishment of a new way of working. It needs to support the core tasks of developing and managing efficient DTs, such as capturing the business motivation, designing the DT, and delivering and operating the DT. The CDD process, which shares similarities with the DevOps
and operation of the DT. The CDD process, which shares similarities with the DevOps
principle of continuous development and operation, has the potential of being adapted for
this purpose. Wortmann et al. [8] also call for this kind of approach to DT development
and operation. The case study requirements suggest that to make the DTs more fitting
to the business model, explicit focus should be on the issues such as business goals,
processes, and integration with the IS architecture. These are issues suitable for EM.
Figure 1 proposes a DT development and management lifecycle that incorporates three
sub-cycles – EM, DT Design, and Delivery and Operation. The internal steps and tasks
in the sub-cycles follow the established procedures in [7] for EM and in [10] for Design
and Operation. The following artifacts support the transition between the sub-cycles
(grey arrows in Fig. 1):
– EM provides explicated knowledge about the business motivation for the DT in the
form of enterprise models.
– Capability design provides (1) capability-based digital twin designs that are executable in the sense that they are integrated with the physical twins, and (2) the monitoring applications for digital twin management, generated from the context model.
– The delivery and operation sub-cycle provides data types (e.g., context element types, measurable properties, KPIs) of the data available at runtime of the digital twin. This allows extending the existing designs as well as selecting existing and obtainable data in new designs. The design sub-cycle provides best practices and reusable components on which the EM sub-cycle can base new business developments.
7 Feasibility Demonstration
Figure 2 illustrates the feasibility of using CDD with a fragment of a capability model consisting of goals, capabilities, and context modeling elements. The digital transformation workshop at company A identified the option to develop a pump-as-a-service product. When prioritizing the options, this option was top-rated and, hence, converted into Goal 1, to develop pumping-as-a-service. It was refined into three sub-goals aiming at low-maintenance pumps (1.1), the possibility of real-time monitoring (1.2), and the development of a preventive maintenance service (1.3). KPIs were set for all three sub-goals. The goal model is shown on the right side of Fig. 2 and follows the 4EM notation.
need to be monitored. For brevity reasons, context and KPI calculations as well as the
operational processes linked to the capabilities are not shown in the model.
References
1. IIRA: Technology Working Group Industrial Internet Consortium, IIC Architecture Task
Group: The Industrial Internet Reference Architecture v1.8 (2017)
2. Strange, R., Zucchella, A.: Industry 4.0, global value chains and international business.
Multinational Bus. Rev. 25(4) (2017)
3. International Electrotechnical Commission: IEC PAS 63088:2017. Smart Manufacturing -
Reference Architecture Model Industry 4.0 (RAMI4.0) (2017)
4. Nardello, M., Han, S., Møller, C., Gøtze, J.: Automated Modeling with Abstraction for
Enterprise Architecture (AMA4EA): Business Process Model Automation in an Industry
4.0 Laboratory, CSIMQ, no. 19 (2019)
5. Sniderman, B., Mahto, M., Cotteleer, M.: Industry 4.0 and Manufacturing Ecosystems:
Exploring the World of Connected Enterprises. Deloitte University Press (2016)
6. Bolton, R., et al.: Customer experience challenges: bringing together digital, physical and
social realms. J. Serv. Manag. 29(5) (2018)
7. Sandkuhl, K., Stirna, J., Persson, A., Wißotzki, M.: Enterprise Modeling. Tackling Business
Challenges with the 4EM Method. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-43725-4
8. Wortmann, A., Barais, O., Combemale, B., et al.: Modeling languages in Industry 4.0: an
extended systematic mapping study. Softw. Syst. Model. 19, 67–94 (2020)
9. Hermann, M., Pentek, T., Otto, B.: Design principles for Industrie 4.0 scenarios. In:
Proceedings of the HICSS 2016. IEEE (2016)
10. Sandkuhl, K., Stirna, J. (eds.): Capability Management in Digital Enterprises. Springer, Cham
(2018). https://doi.org/10.1007/978-3-319-90424-5
11. Frank, U.: Multi-perspective enterprise modeling: foundational concepts, prospects and future
research challenges. Softw. Syst. Model. 13(3), 941–962 (2012)
12. Lillehagen, F., Krogstie, J.: Active Knowledge Modeling of Enterprises. Springer, Heidelberg
(2018). https://doi.org/10.1007/978-3-540-79416-5
13. Hirsch-Kreinsen, H., ten Hompel, M.: Digitalisierung industrieller Arbeit: Entwicklungsper-
spektiven und Gestaltungsansätze. In: Handbuch Industrie 4.0 Bd.3 (2017)
14. Berman, S.J., Bell, R.: Digital transformation: creating new business models where digital
meets physical. In: IBM Institute for Business Value (2011)
15. Sandkuhl, K., Shilov, N., Smirnov, A.: Facilitating digital transformation by multi-aspect
ontologies: approach and application steps. IJSM (2020)
16. Johannesson, P., Perjons, E.: An Introduction to Design Science. Springer, Cham (2014).
https://doi.org/10.1007/978-3-319-10632-8
17. Kampars, J., Zdravkovic, J., Stirna, J., Grabis, J.: Extending organizational capabilities with
open data to support sustainable and dynamic business ecosystems. Softw. Syst. Model (2019).
https://doi.org/10.1007/s10270-019-00756-7
18. Yin, R.K.: Case Study Research: Design and Methods. SAGE Publications, Thousand Oaks
(2002)
19. Kampars, J., Stirna, J.: A repository for pattern governance supporting capability driven
development. In: CEUR Workshop Proceedings, pp. 1–12 (2017)
20. Mittal, S., Khan, M.A., Romero, D., Wuest, T.: A critical review of smart manufacturing &
Industry 4.0 maturity models: implications for small and medium-sized enterprises (SMEs).
J. Manufact. Syst. 49, 194–214 (2018)
21. Schumacher, A., Erol, S., Sihn, W.: A maturity model for assessing Industry 4.0 readiness
and maturity of manufacturing enterprises. Procedia Cirp 52(1), 161–166 (2016)
Integrated On-demand Modeling
for Configuration of Trusted ICT Supply Chains
Jānis Grabis
1 Introduction
Digital enterprises increasingly rely on smart and intelligent decision-making based on non-trivial computations and complex algorithms (Carlsson 2018). Decision-making is perceived as the selection of business process execution alternatives (e.g., accepting or declining a customer request in a Customer Relationship Management system) or the finding of values of quantitative decision variables (e.g., how many products to order in a Warehouse Management System). It is assumed that decisions are made using a kind of algorithm or model, referred to as the decision-making model, and that the process of creating and using the model is referred to as decision-modelling. There are different types of decision-making problems, for example, capacity planning, supplier selection, and fraud detection. Decision-making models have been developed for these typical decision-making problems; however, these models still need to be adapted for specific use cases.
Decision-modelling is often a time-consuming and complex endeavor, while dynamic operations increasingly require decision-making capabilities provided on short notice for solving short life-cycle decision-making problems (Halpern 2015), for example,
the selection of the delivery mode in hyperconnected systems (Sallez et al. 2016). This creates a need for on-demand decision-modelling (ODDM), allowing quick model development, deployment, and execution for a particular short life-cycle application case.
It is proposed that ODDM models can be developed by integrating various modelling paradigms. In particular, mathematical programming models are prescriptive models that allow generating the optimal solution to a decision-making problem, though they require significant development expertise and effort. Data analytical models (e.g., regression, deep neural networks), on the other hand, have predictive capabilities and, given their structure and data, can be estimated on demand. Therefore, we propose to create ODDM models as integrated models, or iODDM. On-demand models are envisioned to have multiple applications. The proposed research will focus specifically on the configuration of Information and Communication Technology (ICT) supply chains (SC), which is a topical problem in practice (Kshetri and Voas 2019).
Challenges associated with intelligent applications and on-demand decision mod-
elling can be summarized as follows:
To address these challenges, the goal of the proposed research is to enable the development of on-demand decision-making models for the configuration of trusted ICT SCs by integrating three modeling paradigms: enterprise modelling, mathematical programming, and big data analysis. Enterprise models represent the decision-making problem in the enterprise context to facilitate integration, mathematical programming models capture the underlying structure of the decision-making problem, and big data analysis or data analytical models account for case-specific variations in data structures and content. The expected scientific contributions are: 1) formulation of the general iODDM model and a new type of SC configuration model; 2) a method for the development of iODDM models as a part of decision-making information systems; 3) extension of enterprise modelling to support the representation of trusted SC modelling aspects; and 4) a big data processing pipeline for decision-modelling data supply.
2 Related Work
Integrated on-demand modeling and the configuration of trusted ICT SCs are the two key areas of innovation of this proposal. Integrated and hybrid modelling is often used to attain the benefits brought by different modelling paradigms (Chandra and Grabis 2016a). The complexity of current decision-making models also requires an appropriate development and execution environment (Chen and Zhang 2014).
Models are typically integrated using a staged approach in which one model serves as an input to another model (Chandra and Grabis 2016b). Optimization models are often combined with simulation models (Amaran et al. 2016). However, the simultaneous running of mathematical programming and data analytical models is rarely considered. Kuo et al. (2015) provide one example where data analytical models feed mathematical programming models in real time. The proposal's authors have made early attempts to integrate mathematical programming models and data analysis models (Grabis et al. 2012), though without generalizing the integration approach. Ning and You (2019) point out that further development of integrated models requires better closed-loop feedback mechanisms. From the computational perspective, the specification of decision-making components using domain-specific languages allows the development of reusable components pluggable into different enterprise applications (Brodsky et al. 2015). Wei and Ma (2014) integrate product configuration, production planning, and production execution on the basis of an ERP system. The Decision Model and Notation allows representing decision-making logic in business processes (Hasić et al. 2018). However, these technologies are often restricted to specific types of decision models and do not cover the full cycle of integrated modelling, especially integration with enterprise modelling or on-demand provisioning of modelling data. There is an increasing interest in modelling as a service, though practical applications are mainly restricted to descriptive models (e.g., Schuff et al. 2018).
Mathematical and simulation models have mainly been used in SC configuration (Chandra and Grabis 2016b), while qualitative and descriptive techniques have been used to evaluate SC risks (Sigler et al. 2017). Patrignani and Kavathatzopoulos (2018) point out that the complexity of ICT systems poses new challenges not addressed by the existing methods. Baryannis et al. (2019) conclude that artificial intelligence could help to address these concerns, though comprehensive methods are yet to be developed.
The proposed research will extend the state of the art by elaborating a new type of integrated decision-making model, which fuses mathematical programming and data analytics in a generalized manner and is configurable according to enterprise models. The model, when tailored to the trusted ICT SC configuration problem, will provide a novel solution to the configuration of this type of SC, using mathematical programming for typical SC configuration decisions and data analytics for risk and trustworthiness evaluation.
3 Model Formulation
model to raise the level of abstraction and improve integration with the decision support information system. The generic model captures the core aspects of decision making and is formulated as a mathematical programming model, while the data analytical model provides case-specific customization using models fitted to the specific case.
analytical models. Various types of data analytical models could be used, ranging from linear regression models to neural networks. The model can be estimated on demand and represents the security and reliability concerns relevant to the particular case.
The features specific to ICT SCs are the representation of complex product structures, the combination of virtual and physical aspects, and domain-specific performance and quality attributes, such as data security, reliability, and licensing.
The enterprise model provides the means for business analysts to configure the iODDM. It represents the case-specific goals and their KPIs, and it restricts the number of plausible scenarios. For instance, configuring the SC for a new product places fewer constraints on decision variables than changing one of the nodes of an existing SC. The decision-making information system provides the data needed for the estimation and operation of data analytical models in a speedy and scalable manner.
It is important to note that the data analytical models are an integral part of the model rather than just input data providers, as is the case in traditional multi-stage modeling. The results of the data analytical modeling are recalculated in every model-solving iteration. Depending on the type of analytical models used, traditional or non-parametric methods (e.g., genetic algorithms) will be used to solve the model.
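As a rough illustration of this coupling, the sketch below re-evaluates a fitted data analytical model inside every iteration of the solver, instead of precomputing its output once as input data. It is a generic Java sketch with invented names, in which a simple enumeration over candidate configurations stands in for the mathematical programming or genetic-algorithm solver; it is not the paper's actual formulation:

import java.util.List;
import java.util.function.ToDoubleFunction;

// Sketch: a candidate SC configuration is scored by combining a prescriptive
// objective (e.g., cost) with a predictive penalty (e.g., a risk score from a
// fitted data analytical model). The predictive model is called inside the
// solving loop, reflecting the integrated (iODDM) coupling.
public class IntegratedSolverSketch {

    public static double[] solve(List<double[]> candidates,
                                 ToDoubleFunction<double[]> cost,
                                 ToDoubleFunction<double[]> predictedRisk,
                                 double riskWeight) {
        double bestObjective = Double.POSITIVE_INFINITY;
        double[] bestConfig = null; // remains null if no candidates are given
        for (double[] config : candidates) {
            // The data analytical model is re-evaluated in every iteration.
            double objective = cost.applyAsDouble(config)
                    + riskWeight * predictedRisk.applyAsDouble(config);
            if (objective < bestObjective) {
                bestObjective = objective;
                bestConfig = config;
            }
        }
        return bestConfig;
    }
}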
4 Modeling Process
The iODDM model should be available for decision-making within a short time period, measured from a couple of minutes to one week. The method supports all steps of model development and includes three main stages: 1) model development; 2) on-demand configuration; and 3) model execution and adaptation (Fig. 3). The model development stage concerns the creation of the iODDM for the decision-making problem. Enterprise modeling techniques are used to specify the modeling goals, constraints, and data entities characteristic of the decision-making problem. The mathematical programming model is also formulated, and it is linked with the enterprise model. The data entities represent the data requirements for decision-making purposes. The usage of data analytical models is represented in an abstract form.
Whenever a need for on-demand modeling arises, the developed model is configured in the second stage of the method. The enterprise model is changed to represent
case-specific objectives and constraints, and these changes are propagated to the decision-making model by means of transformations. Case-specific data sources are bound to the data requirements. This includes the specification of data transformation procedures. It is important to note that both data streams and batch data can be used as data sources. The abstract data analytical models are instantiated using the case-specific data and are made ready for usage in the optimization model.
The model execution stage concerns the actual usage of the models. The two main challenges addressed during this stage are the monitoring of data sources and scalability, which are supported by the modeling platform. Additionally, adaptation of the modeling parameters can also be performed.
5 Tool Support
The ODDM architecture (Fig. 4) comprises the following components:
• ODDM core - the central element of the architecture. It provides a web-based user interface that is used for iODDM model authoring and deployment. After models have been deployed and run, various model-related performance dashboards can be monitored in the ODDM user interface. The ODDM core also takes care of the infrastructure management and the configuration of the remaining components of the ODDM architecture.
• Data ingest - used for ingesting data into the ODDM stream processor. Data ingestion
in Fig. 4 is marked with #1. Typical data sources are open data, internet of things and
other types of data streams. The intelligent application itself can serve as a data source
(e.g. passing in streams of log files or transactions).
• Stream processor - used for processing ingested data streams (#2), which serve as the input for the iODDM model, and for executing the model itself. The stream processor also aggregates the input data up to the defined level of granularity and persists it in the
data lake (#3). During model execution, runtime adjustments are triggered to pass decision-modeling results to consumers, e.g., a supply chain (#7); a simplified sketch of this interplay follows the list.
• Batch processor - provides batch processing of the archived data from the data lake (#4) and fine-tuning of model parameters, which are then saved in the model repository (#5). Various machine learning approaches are applied, based on the available input data and model specifics.
• Model repository - provides storage of the integrated on-demand decision-making models in the form of reusable patterns that are then used by the stream processor (#6). The repository also accumulates model usage performance data. Model performance is measured under specific contextual situations (defined by the input data), and the stream processor is able to switch to a more appropriate alternative model if the context changes.
• Data lake - a distributed, horizontally scalable, persistent data store that integrates with the stream processor (#3) and the batch processor (#4).
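The following sketch condenses the interplay of data ingest, stream processor, data lake, and result delivery described above into a few lines of Java. It is a deliberately simplified illustration with invented names and signatures, not the actual ODDM implementation:

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.ToDoubleFunction;

// Simplified stand-in for the stream processor: it buffers ingested readings
// up to a defined granularity (#2), persists each completed batch to the data
// lake (#3), executes the decision model, and pushes the result to a consumer
// such as a supply chain application (#7).
public class StreamProcessorSketch {

    private final int granularity;
    private final List<Double> buffer = new ArrayList<>();
    private final Consumer<List<Double>> dataLake;
    private final ToDoubleFunction<List<Double>> decisionModel;
    private final Consumer<Double> resultConsumer;

    public StreamProcessorSketch(int granularity,
                                 Consumer<List<Double>> dataLake,
                                 ToDoubleFunction<List<Double>> decisionModel,
                                 Consumer<Double> resultConsumer) {
        this.granularity = granularity;
        this.dataLake = dataLake;
        this.decisionModel = decisionModel;
        this.resultConsumer = resultConsumer;
    }

    // Called for every reading arriving via data ingest (#1).
    public void ingest(double reading) {
        buffer.add(reading);
        if (buffer.size() >= granularity) {
            dataLake.accept(new ArrayList<>(buffer)); // archive the batch
            resultConsumer.accept(decisionModel.applyAsDouble(buffer)); // decide and deliver
            buffer.clear();
        }
    }
}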
6 Conclusion
The paper describes the idea of creating iODDM models, which includes both the modeling approach and the tool support needed to enable the required complex computations. The immediate next steps of this research are the development of case-specific models and the generalization of the observations made.
The proposed approach has a number of potential risks. These include proving the utility of using enterprise models to configure analytical models, data availability, and the stability of mathematical models with constraints augmented by data modeling. Additionally, the proposed method relies on a sufficient level of similarity among the case-specific models and on the reusability of modeling components.
References
Amaran, S., Sahinidis, N.V., Sharda, B., Bury, S.J.: Simulation optimization: a review of algorithms and applications. Ann. Oper. Res. 240(1), 351–380 (2016)
Brodsky, A., Krishnamoorthy, M., Menasce, D.A., Shao, G., Rachuri, S.: Toward smart manufac-
turing using decision analytics. In: Proceedings of 2014 IEEE International Conference on Big
Data, p. 967 (2015)
Baryannis, G., Validi, S., Dani, S., Antoniou, G.: Supply chain risk management and artificial
intelligence: state of the art and future research directions. Int. J. Prod. Res. 57(7), 2179–2202
(2019)
Carlsson, C.: Decision analytics-Key to digitalisation. Inf. Sci. 460, 424–438 (2018)
Chandra, C., Grabis, J.: Reconfigurable supply chains: an integrated framework. In: Chandra, C.,
Grabis, J. (eds.) SC Configuration, pp. 69–86. Springer, Heidelberg (2016a). https://doi.org/10.1007/978-1-4939-3557-4_4
Chandra, C., Grabis, J.: Simulation modeling and hybrid approaches. In: Chandra, C., Grabis, J.
(eds.) SC Configuration, pp. 173–197. Springer, Heidelberg (2016b). https://doi.org/10.1007/978-1-4939-3557-4_9
Chen, C.L.P., Zhang, C.-Y.: Data-intensive applications, challenges, techniques and technologies:
a survey on big data. Inf. Sci. 275, 314–347 (2014)
Grabis, J., Chandra, C., Kampars, J.: Use of distributed data sources in facility location. Comput.
Ind. Eng. 63(4), 855–863 (2012)
Halpern, F.: Next-generation analytics and platforms for business success: TDWI research report (2015). www.tdwi.org
Hasić, F., De Smedt, J., Vanthienen, J.: Augmenting processes with decision intelligence: principles
for integrated modelling. Decis. Support Syst. 107, 1–12 (2018)
Kshetri, N., Voas, J.: Supply Chain Trust. IT Prof. 21(2), 6–10 (2019)
Kuo, Y., Leung, J.M.Y., Meng, H.M., Tsoi, K.K.F.: A real-time decision support tool for disaster
response: a mathematical programming approach. In: Proceedings - 2015 IEEE International
Congress on Big Data, BigData Congress 2015, p. 639 (2015)
Ning, C., You, F.: Optimization under uncertainty in the era of big data and deep learning: when
machine learning meets mathematical programming. Comput. Chem. Eng. 125, 434–448 (2019)
Patrignani, N., Kavathatzopoulos, I.: On the complex relationship between ICT systems and the
planet. In: Kreps, D., Ess, C., Leenen, L., Kimppa, K. (eds.) HCC13 2018, vol. 537, pp. 181–187.
Springer, Heidelberg (2018). https://doi.org/10.1007/978-3-319-99605-9_13
Sallez, Y., Pan, S., Montreui, B., et al.: On the activeness of intelligent Physical Internet containers.
Comput. Ind. 81, 96–104 (2016)
Schuff, D., Corral, K., St. Louis, R.D., Schymik, G.: Enabling self-service BI: a methodology and
a case study for a model management warehouse. Inf. Syst. Front. 20(2), 275–288 (2018)
Sigler, K., Shoemaker, D., Kohnke, A.: Supply Chain Risk Management: Applying Secure
Acquisition Principles to Ensure a Trusted Technology Product. CRC Press, Auburn Hills
(2017)
Wei, J., Ma, Y.-S.: Design of a feature-based order acceptance and scheduling module in an ERP
system. Comput. Ind. 65(1), 64–78 (2014)
Software-Related Modeling
(EMMSAD 2020)
A Modeling Method for Systematic
Architecture Reconstruction of
Microservice-Based Software Systems
1 Introduction
Microservice Architecture (MSA) is a novel approach to architecting service-based software systems that puts a strong emphasis on service-specific independence [11]. MSA promotes (i) tailoring services to exactly one distinct capability; (ii) shifting the responsibilities for a service's design, development, and deployment to a single team composed of members with heterogeneous professional skills; and (iii) keeping services executable, testable, and deployable in isolation [10,11].
MSA is expected to benefit quality attributes like scalability, maintainability, and reliability [11]. Thus, it is frequently used to refactor monolithic systems for which these quality attributes have decreased critically [16].
© Springer Nature Switzerland AG 2020
S. Nurcan et al. (Eds.): BPMDS 2020/EMMSAD 2020, LNBIP 387, pp. 311–326, 2020.
https://doi.org/10.1007/978-3-030-49418-6_21
2 Background
This section presents background information on SAR and an overview of our
modeling languages for viewpoint-based, model-driven MSA engineering.
Service Viewpoint. The viewpoint’s Service Modeling Language [13] targets the
Dev perspective in DevOps-based MSA teams [10]. It enables service developers
to construct service models that specify microservices, interfaces, and endpoints.
Operation Viewpoint. The Operation Modeling Language [13] targets the Ops
perspective [10]. It enables service operators to construct operation models that
describe service deployment and infrastructure for, e.g., service discovery and
monitoring [2].
Figure 1 shows our SAR modeling method in a UML activity diagram. The
sequence of its six activities follows the relationships between MSA viewpoints
(cf. Subsect. 2.2). Each activity targets certain phases of the SAR process (cf.
Subsect. 2.1) and is described in the following subsections together with example
reconstruction models expressed in corresponding modeling languages.
API (JPA)2 are placed before class definitions. Finally, the discovered technology information is assigned to the captured domain concept within a mapping model (cf. Subsect. 2.2). Listing 2 shows the reconstruction models derived during the Domain Modeling SAR activity, i.e., a domain-related technology model and a mapping model.
The Service Modeling activity (cf. Fig. 1) examines the input file set for microservices and related information, which are then to be captured in service models (cf. Subsect. 2.2). In Java-based microservice architectures such information may be found, e.g., in classes that employ annotations for web-based data binding like @RestController and @GetMapping from the Spring3 framework. Docker Compose4 and build scripts also support microservice identification [1].
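To make this detection target concrete, the following minimal Java sketch (hypothetical class name, path, and payload; not taken from the paper or its case study) shows the kind of Spring-annotated class whose presence would be treated as evidence of a microservice and of one of its endpoints:

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// A class like this signals both a microservice candidate (the controller)
// and an interface operation (the annotated handler method).
@RestController
public class OrderController {

    // @GetMapping binds HTTP GET /orders/{id} to this method; during the
    // Service Modeling activity, such a binding would be captured as an
    // endpoint in the reconstructed service model.
    @GetMapping("/orders/{id}")
    public String getOrder(@PathVariable("id") long id) {
        return "order-" + id; // placeholder payload for illustration
    }
}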
Similarly to Activity 2 (cf. Subsect. 3.2), discovered technology information
is handled in a dedicated sub-activity. This service-related technology modeling
sub-activity is entered via activity edge connector “E” (cf. Fig. 1). It proceeds
analogously to the domain-related technology modeling sub-activity of Activity 2
(cf. Fig. 2), but captures newly discovered microservices in service models and
returns to the current method instance via edge connector “F” (cf. Fig. 1).
Listing 3 shows reconstruction models that result during the Service Modeling activity, i.e., a service-related technology model, a service model, and a mapping model (cf. Subsect. 2.2).
2 https://jakarta.ee/specifications/platform/8/apidocs/javax/persistence/Table.html
3 https://spring.io
4 https://docs.docker.com/compose
5 https://docs.docker.com/engine/reference/builder/
This activity focuses on discovering technology information (cf. Fig. 1) that was not yet captured in Activities 2 to 4 (cf. Subsects. 3.2 to 3.4). For example, the Spring framework allows microservice configuration to be kept separate from source code in distinct configuration files. These files may thus not have been examined in Activity 3 and are hence explicitly targeted by the Technical Refinement activity. When a piece of technology information that has not yet been reconstructed is discovered, the technology modeling sub-activity corresponding to the type of the new information is invoked via activity edge connectors "B", "E", or "H" (cf. Fig. 1 and the descriptions of the sub-activities in Subsects. 3.2 to 3.4).
The Technical Refinement SAR activity focuses on extending previously captured reconstruction models. Thus, it specifically targets Phase 3 of the SAR process (cf. Subsect. 2.1), i.e., the manipulation of derived architecture models.
6 https://www.docker.com
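As an illustration of the configuration separation mentioned above, the following hedged Java sketch (hypothetical property names; only the Spring Boot mechanism itself is real) shows a class that binds values from an external application.properties or application.yml file; it is such externalized values, invisible to pure source code analysis in Activity 3, that the Technical Refinement activity would recover from configuration files:

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;

// Binds externalized settings such as
//   discovery.url=http://eureka:8761/eureka
// from application.properties; the value itself lives outside the code.
@Configuration
@ConfigurationProperties(prefix = "discovery")
public class DiscoveryProperties {

    private String url; // populated by Spring from the configuration file

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }
}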
4 Validation
In the following, we validate the applicability of our SAR modeling method (cf. Sect. 3) on a case study microservice architecture (cf. Subsect. 4.1). Moreover, we illustrate the usage of the reconstruction models in the Post-processing activity of our method (cf. Subsect. 3.6) by assessing selected indicators of technical debt risk in the reconstructed architecture (cf. Subsect. 4.2).

Activity 1: Preparation. As input file set, we used the files in the source code folders of LM, which correspond to the examined microservices (cf. Table 1), and the files at the top level of LM's repository (see footnote 7), e.g., "docker-compose.yml". Together, the set comprised 160 files with 8858 lines of code (LOC). Moreover, we created empty technology models for Activities 2, 3, and 4 (cf. Subsect. 3.1), because we were not aware of the technologies employed by LM.
10 https://activemq.apache.org
11 https://www.amqp.org
Weak Source Code and Knowledge Management. Our SAR modeling method does not directly support assessing this ATD type. However, reconstruction models provide a well-defined means for documenting views on microservice architectures (cf. Subsect. 2.1). Consequently, they can accompany centralized MSA documentation [14], as they capture architecture knowledge in a concise format.

communication, i.e., REST and ActiveMQ (cf. Activities 3 and 4 in Subsect. 4.1). We consider LM's risk in this ATD type to be only slightly increased, as in MSA it is common to employ at most one protocol for each communication kind [11]. However, our analysis of the reconstruction mapping models showed that more REST operations are invokable via an HTTP method (26) than explicit REST endpoints were specified (16). Such inconsistencies in services' communication specifications are likely to cause communication failures at runtime.
5 Discussion
For the validation of our SAR modeling method (cf. Sect. 4), we executed it manually on the input file set. We then ensured the correctness of the reconstruction models by comparing them with LM's documentation and double-checking their consistency with LM's source code. Consequently, we perceive our method to be applicable in principle to microservice architectures. Nonetheless, a current threat to validity is the error-proneness resulting from the manual execution of the method. This weakness may be mitigated by employing automated source code analysis techniques, particularly in SAR Activities 2 to 4 (cf. Subsects. 3.2 to 3.4). For example, in the case of Java-based microservice architectures, class bodies and employed annotations, as well as Dockerfiles in general, represent valuable analysis targets (cf. Sect. 4).
The input file set selected in Activity 1 of our SAR modeling method (cf.
Subsect. 3.1) depends on the availability of artifacts in the targeted microservice
architecture. For instance, due to the structure of the case study architecture,
the input file set for the method’s validation mainly consisted of Java files (cf.
Sect. 4). Hence, the reconstruction effort was relatively high, because all LOC
needed to be examined. However, source code files that reflect domain concepts or
service implementations may also be replaced, e.g., by concise models of database
structures or API documentation. Like the SAR process (cf. Subsect. 2.1), our
modeling method does not constrain input file types.
In its current form, the SAR modeling method directly aligns its activities with the viewpoints addressed by our languages for model-driven MSA engineering and with their relationships (cf. Fig. 1 and Subsect. 2.2). As a result, the method does not yet take into account the perspective of stakeholders like business analysts or project managers. To this end, the set of SAR activities would need to be extended with modeling approaches tailored to stakeholders who do not directly participate in software engineering in the context of MSA. Further research is necessary to identify the concerns of these stakeholders and derive corresponding SAR activities.

Since our method anticipates the reconstruction of technology information, the degree of abstraction in reconstruction models may be comparatively close to that of source code. However, due to the usage of mapping models, reconstructed domain and service models are essentially technology-agnostic (cf. Subsects. 3.2 and 3.3). Thus, the sub-activities that capture technology information may be omitted in Activities 2 and 3 when such information is irrelevant to the goal of the conducted SAR process (cf. Subsect. 2.1).
6 Related Work
Alshuqayran et al. [1] conduct an empirical study on eight open source microservice architectures to derive a metamodel for SAR in MSA. They also analyze a set of heterogeneous input files that contain, e.g., Java source code, build scripts, and configuration files (cf. Sects. 3 and 4). The derived metamodel is similar to those of our Service and Operation Modeling Languages [13]. However, it does not support the reconstruction of domain concepts. Furthermore, technologies like Asynchronous Message Bus are fixed metamodel concepts, while with our Technology Modeling Language [12] they can flexibly be integrated into reconstruction models as they occur in input files. In addition, Alshuqayran et al. neither present a concrete syntax for their metamodel, nor do they specify its systematic usage in a SAR process as our modeling method does.

MicroART [9] is a tool for reconstructing microservice architectures. It extracts service-related information, e.g., services' names, ports, and developers, from source code repositories. Moreover, it performs a runtime analysis of log files in order to determine containers, network interfaces, and service interaction relationships. From the gathered information, MicroART instantiates a model from a specifically designed metamodel. Like our approach, MicroART is model-based. In contrast, it does not consider the Domain, Operation, and Technology viewpoints (cf. Subsect. 2.2) when gathering architecture information. Furthermore, a systematic method and a concrete syntax for facilitating architecture analyses are not presented.

Zdun et al. introduce an approach towards assessing MSA conformance [18]. To this end, existing microservice architectures are reconstructed leveraging a formal model with MSA-specific component and connector types. MSA conformance of reconstructed architectures is then assessed via metrics and constraints defined by the relationships between these types. As with our SAR modeling method, the formal models need to be derived manually from existing architecture implementations. However, no modeling language with MSA-specific abstractions is employed to facilitate the creation of the formal models. Moreover, a systematic reconstruction method is not presented and domain-specific information is not considered.
References
1. Alshuqayran, N., Ali, N., Evans, R.: Towards micro service architecture recovery: an empirical study. In: 2018 IEEE International Conference on Software Architecture (ICSA), pp. 47–56 (2018)
2. Balalaie, A., Heydarnoori, A., Jamshidi, P.: Microservices architecture enables devops: migration to a cloud-native architecture. IEEE Softw. 33(3), 42–52 (2016)
3. Bass, L., Clements, P., Kazman, R.: Software Architecture in Practice, 3rd edn. Addison-Wesley, Boston (2013)
4. Bogner, J., Fritzsch, J., Wagner, S., Zimmermann, A.: Microservices in industry: insights into technologies, characteristics, and software quality. In: 2019 IEEE International Conference on Software Architecture Companion (ICSA-C), pp. 187–195 (2019)
5. Daigneau, R.: Service Design Patterns. Addison-Wesley, Boston (2012)
6. Di Francesco, P., Malavolta, I., Lago, P.: Research on architecting microservices: trends, focus, and potential for industrial adoption. In: 2017 IEEE International Conference on Software Architecture (ICSA), pp. 21–30. IEEE (2017)
7. Evans, E.: Domain-Driven Design. Addison-Wesley, Boston (2004)
8. Fielding, R.: Representational state transfer. Ph.D. thesis (2000)
9. Granchelli, G., Cardarelli, M., Francesco, P.D., Malavolta, I., Iovino, L., Salle, A.D.: Towards recovering the software architecture of microservice-based systems. In: 2017 IEEE International Conference on Software Architecture Workshops (ICSAW), pp. 46–53 (2017)
10. Nadareishvili, I., Mitra, R., Mclarty, M., Amundsen, M.: Microservice Architecture. O'Reilly Media, Sebastopol (2016)
11. Newman, S.: Building Microservices. O'Reilly Media, Sebastopol (2015)
12. Rademacher, F., Sachweh, S., Zündorf, A.: Aspect-oriented modeling of technology heterogeneity in microservice architecture. In: 2019 IEEE International Conference on Software Architecture (ICSA), pp. 21–30. IEEE (2019)
13. Rademacher, F., Sorgalla, J., Wizenty, P., Sachweh, S., Zündorf, A.: Graphical and textual model-driven microservice development. In: Bucchiarone, A., et al. (eds.) Microservices, pp. 147–179. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-31646-4_7
14. Soares de Toledo, S., Martini, A., Przybyszewska, A., Sjøberg, D.I.K.: Architectural technical debt in microservices: a case study in a large company. In: 2019 IEEE/ACM International Conference on Technical Debt (TechDebt), pp. 78–87. IEEE (2019)
15. Taibi, D., Lenarduzzi, V.: On the definition of microservice bad smells. IEEE Softw. 35(3), 56–62 (2018)
16. Taibi, D., Lenarduzzi, V., Pahl, C.: Processes, motivations, and issues for migrating to microservices architectures: an empirical investigation. IEEE Cloud Comput. 5, 22–32 (2017)
17. Taibi, D., Lenarduzzi, V., Pahl, C.: Continuous architecting with microservices and DevOps: a systematic mapping study. In: Muñoz, V.M., Ferguson, D., Helfert, M., Pahl, C. (eds.) CLOSER 2018. CCIS, vol. 1073, pp. 126–151. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29193-8_7
18. Zdun, U., Navarro, E., Leymann, F.: Ensuring and assessing architecture conformance to microservice decomposition patterns. In: Maximilien, M., Vallecillo, A., Wang, J., Oriol, M. (eds.) ICSOC 2017. LNCS, vol. 10601, pp. 411–429. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69035-3_29
Can We Design Software as We Talk?
A Research Idea
1 Introduction
Living in a digital era, service providers are challenged to offer services to their customers through a wide spectrum of channels. This constant introduction of new devices and technology challenges organisations to provide rapid
This research project is supported by ZHAW Digital and the Digitalisation Initiative of Zürich Universities (DIZH).
© Springer Nature Switzerland AG 2020
S. Nurcan et al. (Eds.): BPMDS 2020/EMMSAD 2020, LNBIP 387, pp. 327–334, 2020.
https://doi.org/10.1007/978-3-030-49418-6_22
[Figure: overview of the envisioned approach to the automatic specification of user stories, comprising an interactive board, an automatic transcription tool producing a requirements transcript, a deep learning classifier, an ontology crawler component, and a user story assembler; annotated with the example "As a <user> I want to <buy> products so my SuperStore is <well supplied>" for the SuperStore case.]
Sect. 3.2. Finally, we conclude our research idea paper by discussing lessons learnt and future lines of work in Sect. 4.
2 Related Work
In the field of requirements engineering, several related works approach the challenge of automating requirements specification from different angles. We analyse these approaches based on: (a) the requirements source: audio recordings/transcripts from requirements meetings, tweets, bug reports, user stories, existing documentation, or a domain repository; (b) the generated requirements specification, in the shape of: meeting minutes, knowledge extraction, tweet classification, relevant topics, remedied user stories, meeting summaries, or user stories; (c) existing validation or evaluation: laboratory demonstration or comparative experiment; (d) existence or not of tool support; and (e) whether or not it has been applied in practice.
Some works focus on supporting software requirements specification by generating meeting minutes. For instance, Kaiya et al. [9] propose a tool to support requirements elicitation meetings by recording the sessions and providing an assistant tool to manage the recordings and mark the important points via hypertext. The authors conclude that further collaboration mechanisms need to be incorporated to facilitate real-time editing of requirements and knowledge sharing. After Murray et al. [10] developed a natural language processing approach to summarize emails and conversations in general, more projects involving textual sources appeared. Especially in the field of machine learning, multiple techniques were developed to extract information relevant to requirements engineering from different written origins [4, 11–14].

Rodeghero et al. [13] proposed a machine learning classification algorithm trained to recognise user story information [15]. As a conclusion of this study, the authors found that information about software functionality and requirements rationale can
330 M. Ruiz and B. Hasselman
sessions. Our research strategy is summarised in Fig. 2. In short, our research idea is to build a deep learning algorithm that can be further trained by providing labeled requirements elicitation sessions. For identifying missing roles, we propose to make use of existing ontologies that provide information related to typical roles belonging to the context in which software elicitation sessions take place.

In this paper, we summarise the deep learning classifier (see Sect. 3.1) and the ontology crawler component (see Sect. 3.2). For implementation purposes, we chose the Java language, as it guarantees portability and its popularity results in maintained and tested frameworks we can use. It has a sophisticated deep learning framework available in DL4J1 and the Java OWL API2 for handling ontology files. The components will later provide the data to be used by the user story assembler component (out of the scope of this paper).
1 https://deeplearning4j.org/
2 http://owlcs.github.io/owlapi/
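The following minimal Java sketch illustrates the ontology crawler idea under stated assumptions: the file name "retail-roles.owl" and its contents are hypothetical, while the OWL API calls themselves are standard. It loads a domain ontology and lists its classes as candidate roles for the <user> slot of a user story:

import java.io.File;

import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.OWLClass;
import org.semanticweb.owlapi.model.OWLOntology;
import org.semanticweb.owlapi.model.OWLOntologyCreationException;
import org.semanticweb.owlapi.model.OWLOntologyManager;

public class RoleCrawler {

    public static void main(String[] args) throws OWLOntologyCreationException {
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
        // "retail-roles.owl" stands in for an ontology of typical roles in the
        // domain where the elicitation session takes place.
        OWLOntology ontology =
                manager.loadOntologyFromOntologyDocument(new File("retail-roles.owl"));
        // Every class in the ontology (e.g., Customer, SupplyManager) becomes
        // a candidate role to fill a missing <user> slot.
        for (OWLClass cls : ontology.getClassesInSignature()) {
            System.out.println(cls.getIRI().getShortForm());
        }
    }
}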
For our initial development process, we used the smallest set, "Wikipedia 2014 + Gigaword", which consists of 6 billion tokens and a representation of 50 dimensions.
The implementation of the deep learning classifier is available in our public GitHub repository at https://github.com/lmruizcar/requirements classifier. An example of classification is presented in Fig. 3. The model in its current state performs about as well as random guessing, since we still need data for training purposes. As has been mentioned by [16], the lack of data from requirements elicitation sessions is an obstacle in this type of investigation. Our model differentiates between three labels: None (0), Non-Functional (1), and Functional (2). A caveat of this deep learning approach is that it accounts only indirectly for the fact that turns can be labelled both 1 and 2. Whereas [13] built multiple binary classifiers which each analysed the turn, our approach uses a SoftMax layer whose output is interpretable as probabilities. A turn that falls into both categories would have probabilities around 0.5 for both labels, which can be interpreted individually but is not represented in the standard evaluation methods for machine learning classifiers.
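For concreteness, the following sketch shows how such a three-label classifier over 50-dimensional GloVe features could be configured in DL4J, with the SoftMax output layer discussed above; the hidden-layer size and other hyperparameters are our assumptions, not the authors':

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class TurnClassifier {

    public static MultiLayerNetwork build() {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .list()
                // 50 inputs: one per GloVe dimension of the turn representation.
                .layer(0, new DenseLayer.Builder()
                        .nIn(50).nOut(32) // hidden size of 32 is an assumption
                        .activation(Activation.RELU)
                        .build())
                // SoftMax over the three labels None (0), Non-Functional (1),
                // Functional (2); the scores sum to 1, which is why a turn
                // belonging to both categories shows up as two values near 0.5.
                .layer(1, new OutputLayer.Builder(
                        LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nIn(32).nOut(3)
                        .activation(Activation.SOFTMAX)
                        .build())
                .build();
        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        return net;
    }
}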
Fig. 4. Example of executing the ontology crawler component in the context of the
SmallShop case
products. And for returning customers it is good practice to store relevant information
like the shipping address in a user account.
References
1. Gebhart, M., Giessler, P., Abeck, S.: Challenges of the digital transformation in software engineering. In: The Eleventh International Conference on Software Engineering Advances (ICSEA) (2016)
2. Mund, J., Femmer, H., Mendez, D., Eckhardt, J.: Does quality of requirements specifications matter? Combined results of two empirical studies (2017). https://arxiv.org/pdf/1702.07656.pdf
3. Chakraborty, A., Baowaly, M., Arefin, A., Bahar, A.: The role of requirement engineering in software development life cycle. J. Emerg. Trends Comput. Inf. Sci. 3, 723–729 (2012)
4. Dalpiaz, F., Brinkkemper, S.: Agile requirements engineering with user stories. In: 26th International Requirements Engineering Conference (RE), Banff, AB, Canada (2018)
5. Wagenaar, G., Overbeek, S., Lucassen, G., Brinkkemper, S., Schneider, K.: Working software over comprehensive documentation - rationales of agile teams for artefacts usage. J. Softw. Eng. Res. Dev. 6, 7 (2018). https://doi.org/10.1186/s40411-018-0051-7
6. Wüest, D., Seyff, N., Glinz, M.: FlexiSketch: a lightweight sketching and metamodeling approach for end-users. Softw. Syst. Model. 18(2), 1513–1541 (2017). https://doi.org/10.1007/s10270-017-0623-8
7. Damian, D., Zowghi, D.: RE challenges in multi-site software development organizations. Requir. Eng. J. 8, 149–160 (2003). https://doi.org/10.1007/s00766-003-0173-1
8. Wüest, D., Seyff, N., Glinz, M.: Sketching and notation creation with FlexiSketch team: evaluating a new means for collaborative requirements elicitation. In: 23rd International Requirements Engineering Conference (RE), Ottawa (2015)
9. Kaiya, H., Saeki, M., Ochimizu, K.: Design of a hyper media tool to support requirements elicitation meetings. In: Proceedings Seventh International Workshop on Computer-Aided Software Engineering (1995)
10. Murray, G., Carenini, G.: Summarizing spoken and written conversations. In: Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA (2008)
11. Guzman, E., Ibrahim, M., Glinz, M.: A little bird told me: mining tweets for requirements and software evolution. In: 25th International Requirements Engineering Conference (RE) (2017)
12. Rastkar, S., Murphy, G.C., Murray, G.: Summarizing software artifacts: a case study of bug reports. In: Proceedings of the 32nd International Conference on Software Engineering, New York, NY, USA, vol. 1 (2010)
13. Rodeghero, P., Jiang, S., Armaly, A., McMillan, C.: Detecting user story information in developer-client conversations to generate extractive summaries. In: 39th International Conference on Software Engineering (ICSE) (2017)
14. Abad, Z.S.H., Gervasi, V., Zowghi, D., Barker, K.: ELICA: an automated tool for dynamic extraction of requirements relevant information. In: International Workshop on Artificial Intelligence for Requirements Engineering (AIRE) (2018)
15. Krasniqi, R., Jiang, S., McMillan, C.: TraceLab components for generating extractive summaries of user stories. In: International Conference on Software Maintenance and Evolution (ICSME) (2017)
16. Rodeghero, P.: Behavior-informed algorithms for automatic documentation generation. In: International Conference on Software Maintenance and Evolution (ICSME) (2017)
17. Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar (2014)
Non-Functional Requirements Orienting the Development of Socially Responsible Software
1 Introduction
Today, software is embedded in almost everything we buy or use daily in our lives.
In recent years, AI has been increasingly used to deliver solutions in many different
commercial and regulatory domains, from personal assistance devices such as Alexa1
1 https://developer.amazon.com/en-US/alexa
to face recognition technologies used by law enforcement agencies. However, the use of AI raises doubts in the minds of consumers regarding how much we can trust AI2 to make decisions on our behalf. The lack of trust seems to be more prevalent in mission-critical systems where personal safety is in the care of the machine. Kolm [1] shows that 70% of Canadians are comfortable with AI scheduling appointments, but only 39% feel comfortable with AI piloting autonomous vehicles. Therefore, we believe that the software development process needs to address ways to assure consumers that they can trust the software embedded in the products they are buying and/or using.
Applications utilizing concepts related to the Internet of Things, cloud services, and mobile technologies raise similar concerns, aggravated by expectations of privacy and safety, triggering ethical questions that will directly impact how much customers can trust their devices. Although there are works [2, 3] tackling trust related to machine learning and decision support systems, they look at trust in a single dimension and do not capture the consequences of trust from a social perspective.

Our work aims to consider trust in AI-based software from a citizen's viewpoint, using the metaphor of Corporate Social Responsibility (CSR)3. In the past two decades, many works have pointed out that CSR goes a long way in promoting positive outcomes such as loyalty, repeat business, and purchase intention [4, 5]. Furthermore, CSR efforts may also positively impact the market value of companies that are perceived to be committed to social responsibility [6]. One important aspect of CSR is that its adoption reduces information asymmetry [7] and, as such, brings about transparency. Also noteworthy is the entanglement of CSR with the broader concept of Corporate/Company Reputation [30].

One of the main reasons consumers value CSR is that it promotes intrinsic trust in the company. One way of looking at trust is to measure how much a consumer thinks a company can be deemed reliable in situations entailing risk to the consumer. One critical factor is how much consumers believe the company's actions and behaviors have the consumers' interest and welfare in mind [9].

This work builds an initial argument for why using CSR knowledge helps software engineers develop trustworthy software. The benefits of using CSR concepts would be twofold: i) developing trustworthy systems that would help to retain customers and increase market share, and ii) using this trustworthiness as the basis for developing socially responsible systems that are likely to be in high demand in the near future.

In this idea paper, we present the foundation for our ongoing work. We are tackling what we believe will be the core requirements to deliver trustworthy software. We associate these requirements with the perspective of society in general, which should be able to opt for socially responsible software, together with the goals of repeat business and improved market share. We hope to inspire other researchers to explore similar paths.
2 https://www.nytimes.com/2020/01/18/technology/clearview-privacy-facial-recognition.html
3 The business domain debates the similarities/differences between the acronyms CS (Corporate Sustainability) and CSR. We side with those that consider CS and CSR synonyms.
2 Method
We carried out a brief literature review, starting with the CSR domain, to investigate the qualitative properties used in that field to promote corporate social responsibility that could be adapted to the software domain. We used keyword searches such as (csr AND trust) and (corporate social responsibility AND trust) for the years 2015 to 2019. We recovered 112 publications. After removing duplicated references, snowballing, and examining the abstracts, we reviewed 22 references. Our choice was based on the linkage of CSR and trust.

We elicited knowledge from these references to create a matrix with the most often mentioned properties expected to be present in companies adopting the CSR approach. Next, we analyzed which of those properties should be implemented in companies' information systems in general to contribute to a trustworthy and socially responsible environment. Using an NFR (Non-Functional Requirements) perspective [21], we identified three NFRs in these properties: Trust, Ethics, and Transparency. Using earlier knowledge [22, 27], we searched for NFRs which may interact either positively or negatively with these three, and we elicited Privacy, Safety, and Security.

Our aim was not to build software to support CSR adoption by companies. We studied the CSR approach as a basis to propose the critical qualitative properties for developing trustworthy information systems software, grounded in the company's goals. As such, these qualities would stand as essential NFRs for developing software that supports the delivery of socially responsible information systems, either as a goal in its own right or as part of the adoption of CSR. At the same time, companies that are interested in a stricter approach to CSR will also have a solid base to start from.
3 Results
As pointed out by Vlachos et al. [10] and others, trust is central for companies to be perceived as socially responsible. By the same token, we set trust as the primary NFR to be satisficed, i.e., satisfied within acceptable limits. Our reasoning follows the idea that safety-critical systems, and any software using AI as well as advanced forms of technology such as the Internet of Things and cloud systems, will inherently trigger fear in many customers, who are forced to relinquish their safety to a machine, to face unwanted misuse of their behaviors and preferences, or sometimes both.

In corporate domains, trust comes when providers demonstrate ethical behavior enforced in their software. Bowen illustrates such a scenario for safety-critical systems [11]. In order to promote trust, software engineers need to take a bottom-up approach, developing domain-specific knowledge of the elements that build the foundations of trust.

Trust is also frequently linked to the concept of ethics. Consumers tend to trust companies that they perceive as ethically sound [12]. Consumers also tend to believe that companies are following moral standards if they are transparent in the way they do business [6]. The ISO 26000 standard points out how businesses and organizations should handle ethical and transparency concerns to act responsibly [13]. We believe that a similar perspective could be applied to software development in general.
Ethics helps to promote trust [5, 14], together with the understanding that safety-
critical systems need to demonstrate ethical behavior to be accepted by consumers.
4 Recent work [31] brings an operationalization perspective on how to use goal models to define systems considering privacy, security, and trust.
arise due to suspect software behavior. In Fig. 1, a SIG maps the interactions among softgoals (NFRs) [21]: Ethics, Safety, Privacy, and Security contribute positively (Help) to Trust, but Transparency is needed for their contribution to be effective in mitigating legal disputes. We certainly acknowledge that other NFRs, like Reliability, will also play a relevant role in distinct types of applications. Nevertheless, we believe the NFRs illustrated in Fig. 1 are the anchor we need to carefully elicit and model operationalizations for developing software that people can trust.
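Purely as an illustration (our own encoding, not an artifact of the paper), the contribution links just described could be captured programmatically so that catalogue tooling can traverse them; the sketch below uses the NFR Framework's Help link type and treats Transparency as helping Trust, since the text says it is required for the other contributions to be effective:

import java.util.List;
import java.util.Map;

public final class TrustSig {

    enum Softgoal { TRUST, ETHICS, TRANSPARENCY, PRIVACY, SAFETY, SECURITY }

    // Help links pointing at Trust, per the description of Fig. 1.
    static final Map<Softgoal, List<Softgoal>> HELPED_BY = Map.of(
            Softgoal.TRUST,
            List.of(Softgoal.ETHICS, Softgoal.SAFETY, Softgoal.PRIVACY,
                    Softgoal.SECURITY, Softgoal.TRANSPARENCY));

    private TrustSig() { }
}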
4 Conclusion
Society has been changing and evolving at a fast pace. Ubiquitous computing, massive social connection, and the growing use of AI/ML (Machine Learning), quite often linked to IoT concepts, have been pushing software development to a new paradigm. In a recent paper [8], Agrawal et al. stated: "Machine Learning models are software artifacts derived from data". More than ever, we cannot afford to build software targeting one single scenario of use. New software may have immense social impact with legal implications. We need to move our practice to embrace this new scenario, where we must build software that is trustworthy and can be accountable for behaving in a socially responsible way.

Our contribution relies on eliciting, from the social sciences, basic qualities for socially responsible software, to be represented as SIG catalogues anchored on the NFR Framework [21]. We will be developing catalogues to capture as many solutions (operationalizations) as possible for each NFR illustrated in Fig. 1. We aim to research systematic ways to search for and find satisficing solutions to each of the above NFRs and integrate these solutions into software reuse processes, taking into consideration how each possible solution will impact other NFRs. We will revisit and extend existing catalogues such as Leite's transparency [28] and Zinovatna's privacy and transparency [22], as well as exploring existing operationalizations, such as [31]. We will also focus on better understanding the implications of ethical concepts in the development of software and how they would impact trust, as well as their legal ramifications. That will lead us to investigate personal and group values that are closely related to ethical aspects [29].

At the core of our research, trust is the primary goal to be achieved. If consumers can trust your company and, by extension, your products (software), they tend to become loyal to your brand and refer your products to acquaintances, which in a social network era can translate into benefits while avoiding legal disputes.
References
1. Kolm, J.: How comfortable are Canadians with AI? Strategy. http://strategyonline.ca/2017/12/14/how-comfortable-are-canadians-with-ai/. Accessed 13 Nov 2018
2. Bussone, A., Stumpf, S., O'Sullivan, D.: The role of explanations on trust and reliance in clinical decision support systems. In: Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015, pp. 160–169. Institute of Electrical and Electronics Engineers Inc. (2015). https://doi.org/10.1109/ICHI.2015.26
3. Ribeiro, M.T., Singh, S., Guestrin, C.: "Why should I trust you?" Explaining the predictions of any classifier. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144. Association for Computing Machinery, New York (2016). https://doi.org/10.1145/2939672.2939778
4. Chaudhuri, A., Holbrook, M.B.: The chain of effects from brand trust and brand affect to brand performance: the role of brand loyalty. J. Mark. 65, 81–93 (2001). https://doi.org/10.1509/jmkg.65.2.81.18255
5. Park, E., Kim, K.J., Kwon, S.J.: Corporate social responsibility as a determinant of consumer loyalty: an examination of ethical standard, satisfaction, and trust. J. Bus. Res. 76, 8–13 (2017). https://doi.org/10.1016/J.JBUSRES.2017.02.017
6. Kang, J., Hustvedt, G.: Building trust between consumers and corporations: the role of consumer perceptions of transparency and social responsibility. https://doi.org/10.1007/s10551-013-1916-7
7. Cui, J., Jo, H., Na, H.: Does corporate social responsibility affect information asymmetry? J. Bus. Ethics 148, 549–572 (2018). https://doi.org/10.1007/s10551-015-3003-8
8. Agrawal, A., et al.: Cloudy with high chance of DBMS: a 10-year prediction for enterprise-grade ML (2019)
9. Delgado-Ballester, E., Munuera-Aleman, J.L., Yague-Guillen, M.J.: Development and validation of a brand trust scale. Int. J. Mark. Res. 45, 35–56 (2003)
10. Vlachos, P.A., Tsamakos, A., Vrechopoulos, A.P., Avramidis, P.K.: Corporate social responsibility: attributions, loyalty, and the mediating role of trust. J. Acad. Mark. Sci. 37, 170–180 (2009). https://doi.org/10.1007/s11747-008-0117-x
11. Bowen, J.: The ethics of safety-critical systems. Commun. ACM 43, 91–97 (2000). https://doi.org/10.1145/332051.332078
12. Pivato, S., Misani, N., Tencati, A.: The impact of corporate social responsibility on consumer trust: the case of organic food. Bus. Ethics A Eur. Rev. 17, 3–12 (2007). https://doi.org/10.1111/j.1467-8608.2008.00515.x
13. ISO: ISO 26000 Social responsibility. https://www.iso.org/iso-26000-social-responsibility.html. Accessed 22 Oct 2019
14. Bews, N.F., Rossouw, G.J.: A role for business ethics in facilitating trustworthiness. J. Bus. Ethics 39, 377–390 (2002). https://doi.org/10.1023/A:1019700704414
15. Lin, P.: Why ethics matters for autonomous cars. In: Maurer, M., Gerdes, J., Lenz, B., Winner, H. (eds.) Autonomes Fahren, pp. 69–85. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-45854-9_4
16. Buck, C., Stadler, F., Suckau, K., Eymann, T.: Privacy as a part of the preference structure of users' app buying decision. In: Proceedings of the Wirtschaftsinformatik 2017 (2017)
17. Thierer, A.D.: The internet of things and wearable technology: addressing privacy and security concerns without derailing innovation. SSRN 21 (2014). https://doi.org/10.2139/ssrn.2494382
18. Huang, F., Wang, Y., Wang, Y., Zong, P.: What software quality characteristics most concern safety-critical domains? In: 2018 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C), pp. 635–636. IEEE (2018). https://doi.org/10.1109/QRS-C.2018.00111
19. Wilson, C., Hargreaves, T., Hauxwell-Baldwin, R.: Benefits and risks of smart home technologies. Energy Policy 103, 72–83 (2017). https://doi.org/10.1016/J.ENPOL.2016.12.047
20. Veleda, R., Cysneiros, L.M.: Towards an ontology-based approach for eliciting possible solutions to non-functional requirements. In: Giorgini, P., Weber, B. (eds.) CAiSE 2019. LNCS, vol. 11483, pp. 145–161. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-21290-2_10
21. Chung, L., Nixon, B.A., Yu, E., Mylopoulos, J.: Non-Functional Requirements in Software Engineering. Springer, Boston (1999). https://doi.org/10.1007/978-1-4615-5269-7
22. Zinovatna, O., Cysneiros, L.M.: Reusing knowledge on delivering privacy and transparency together. In: 2015 IEEE Fifth International Workshop on Requirements Patterns (RePa), pp. 17–24 (2015). https://doi.org/10.1109/RePa.2015.7407733
23. de Gramatica, M., Labunets, K., Massacci, F., Paci, F., Tedeschi, A.: The role of catalogues of threats and security controls in security risk assessment: an empirical study with ATM professionals. In: Fricker, S., Schneider, K. (eds.) REFSQ 2015. LNCS, vol. 9013, pp. 98–114. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16101-3_7
24. Cardoso, E., Almeida, J.P., Guizzardi, R.S., Guizzardi, G.: A method for eliciting goals for business process models based on non-functional requirements catalogues. In: Frameworks for Developing Efficient Information Systems: Models, Theory, and Practice, pp. 226–242 (2013)
25. Bachmann, R., Gillespie, N., Priem, R.: Repairing trust in organizations and institutions: toward a conceptual framework. Organ. Stud. 36, 1123–1142 (2015). https://doi.org/10.1177/0170840615599334
26. Cysneiros, L.M., Yu, E.: Non-functional requirements elicitation. In: do Prado Leite, J.C.S., Doorn, J.H. (eds.) Perspectives on Software Requirements. SECS, vol. 753, pp. 115–138. Springer, Boston (2004). https://doi.org/10.1007/978-1-4615-0465-8_6
27. Cysneiros, L.M., Raffi, M., Sampaio do Prado Leite, J.C.: Software transparency as a key requirement for self-driving cars. In: 2018 IEEE 26th International Requirements Engineering Conference (RE), pp. 382–387. IEEE (2018). https://doi.org/10.1109/RE.2018.00-21
28. do Prado Leite, J.C.S., Cappelli, C.: Software transparency. Bus. Inf. Syst. Eng. 2, 127–139 (2010). https://doi.org/10.1007/s12599-010-0102-z
29. Ferrario, M.A., Simm, W., Forshaw, S., Gradinar, A., Tavares Smith, M., Smith, I.: Values-first SE: research principles in practice. https://doi.org/10.1145/2889160.2889219
30. Shim, K., Yang, S.: The effect of bad reputation: the occurrence of crisis, corporate social responsibility, and perceptions of hypocrisy and attitudes toward a company. Public Relat. Rev. 42(1), 68–78 (2016)
31. Salnitri, M., Angelopoulos, K., Pavlidis, M., et al.: Modelling the interplay of security, privacy and trust in sociotechnical systems: a computer-aided design approach. Softw. Syst. Model. 19, 467–491 (2020). https://doi.org/10.1007/s10270-019-00744-x
Domain-Specific Modeling (EMMSAD 2020)
A Journey to BSO: Evaluating Earlier and More Recent Ideas of Mario Bunge as a Foundation for Information Systems and Software Development
Roman Lukyanenko
1 Introduction
With the increased reliance on information technology (IT), the importance of building IT on solid foundations grows [1, 2]. Historically, one of the most prolific and effective references for IT analysis, design, and development has been ontology. In this paper we focus on general ontology – a branch of philosophy which studies what exists in reality (and what reality is) [3, 4] – rather than domain ontology – a description (often formal) of the constructs in a particular domain (e.g., the ontology of Software Defects, Errors and Failures, see [5], or the ontology of research validity [6]) [7].
A general (also known as foundational or upper-level) ontology offers IT development a theoretically grounded (i.e., based on established knowledge from other disciplines, such as psychology or physics), consistent, formalized, and rigorous meaning for the basic notions of what exists in reality and thus in a domain of IT. As such, ontological studies are
© Springer Nature Switzerland AG 2020
S. Nurcan et al. (Eds.): BPMDS 2020/EMMSAD 2020, LNBIP 387, pp. 345–358, 2020.
https://doi.org/10.1007/978-3-030-49418-6_24
Briefly, following the philosophy of Bunge, BWW [24, 25] argues that the world is made of things – substantial individuals – which possess properties; things may compose, forming composite things, and interact with one another, leading to the acquisition of new or the loss of existing properties, resulting in events; sets of things form systems. Properties are not directly accessible to human observers, resulting in the notion of attributes; the attributes which humans ascribe to things may or may not be accurate or complete representations of the underlying properties. In sum, the key constructs from Bunge which have been adopted into BWW are: thing, property, attribute, functional schema, state, law, state space, event, history, coupling, system, class, kind, and their derivatives (e.g., lawful state space); see Table 1, p. 222 [33] and [34].
The BWW ontology, and the theories, models, and methods derived from it, have been used widely in conceptual, empirical, and design work in information systems, conceptual modeling, software engineering, and other areas [26], making it among the most important developments in the area of ontology in the disciplines of IT [35].

Despite the prolific use of BWW (for the most recent reviews, see [35, 36]), the ontology has been criticized for its narrow physicalist focus, its lack of attention to social and psychological phenomena, and postulates which may be problematic for modeling certain types of domain rules (e.g., BWW proscribed optional properties and denied the independent existence of properties and of properties of properties) [4, 21, 28, 37].

Despite the many debates centered on BWW, a generally overlooked issue is that the original ontology is based on select references from Bunge. Although there have been some attempts at expanding BWW to incorporate other ideas of Bunge [32], these were still narrow in scope and did not see widespread adoption compared with BWW.
The basis for BWW has been two manuscripts by Bunge, seminal ones focused on ontology. However, as Bunge frequently noted, ontology is inseparable from other beliefs, such as beliefs about how to obtain knowledge of the world [38]. Indeed, the Treatise contained many additional beliefs, including on matters of semantics, epistemology, methodology, ethics, and technology, among others.

Additionally, in the over 40 years since the publication of the 1977 and 1979 volumes (and even since the last book of the Treatise, on ethics [39]), Bunge published over 400 manuscripts in which his ideas were further expanded, refined, and sometimes altered1. Some of these more recent ideas are of great potential relevance to IT, as they directly deal with issues of information technology, e.g., [41].

Considering the broad and profound impact BWW has had on the disciplines of IT, such as conceptual modeling, an important question to ask is: can we further IT research and practice by incorporating these more recent views and beliefs of Mario Bunge? To answer this question, it is first necessary to assess the extent to which the original basis for BWW and the more recent thinking agree and diverge.
2 For example, although Bunge has placed a stronger emphasis on systems, his recent writing is still rich in references to things, including in the same texts where he talks about systems being the primary existents and preferable to the notion of things (e.g., [45, p. 174]).
3 https://scholar.google.com/citations?user=7MmcYgEAAAAJ&hl=en&oi=ao
4 For example, whereas Bunge describes systems in [44, p. 270] (among many other sources), a more detailed discussion of the properties of systems can be found, for example, in [38, pp. 10–19].
(BSO), which, as we argue later, is a new ontology. Bunge uses multiple labels to describe his set of beliefs (e.g., "emergentist materialism" [47], "hylorealism" [38, p. 27]), but the most frequently used appears to be "systemism" [30, 46, 48].

The word 'system' is more neutral than 'thing', which in most cases denotes a system endowed with mass and perhaps tactually perceptible; we find it natural to speak of a force or field as a system, but we would be reluctant to call it a thing. By calling all existents "concrete systems" we tacitly commit ourselves, in tune with a growing suspicion in all scientific quarters, that there are no simple, structureless entities.
Systemism doesn’t suggest things no longer exist, but for Bunge it appears more
productive to think about the basic elements of the world as systems, rather than things.
Yet, it is notable that he has not fully committed himself to this thinking as he admits a
possibility of atomic things (i.e., “non-systems”) [48, p. 148]:
Only particle physicists study non-systems, such as quarks, electrons, and photons.
But they know that all such simple things are parts of systems or will eventually
be absorbed by some system.
Yet, as conceptual modeling and many other areas of IT do not engage with quarks, electrons, and photons (perhaps progress in quantum computing may change this in the future), these disciplines may effectively disregard the caveat in [48, p. 148] and treat all existents of interest as systems. Thus, we can conclude that in BSO, the world is made of systems.
Bunge believes systems are always composed of components or parts [46, p. 23]. However, it is not clear what the constructs of "part" and "component" mean – we have not seen their definition in Bunge's writings. A way to avoid this problem is to recognize, once again, that in the domains of interest to IT, parts or components of systems are systems themselves. This is an important realization, as it liberates the field of IT from the need to resolve the fundamental ontological status of the "component" or "system part".

Over the years, Bunge developed and expanded his ontology of systems. Thus, in the Treatise, Bunge postulated that any system has "a definite composition, a definite environment, and a definite structure. The composition of the system is the set of its components; the environment, the set of items with which it is connected; and the structure, the relations among its components as well as among these and the environment" [30, p. 4].

In later writings, this initial idea was developed into the CESM model, which, in addition to composition, environment, and structure (present in BWW), added "mechanism" [48]. Mechanism is defined as the "characteristic processes, that make [the system] what it is and the peculiar ways it changes" [38, p. 126]. To illustrate, Bunge provides the example of a traditional nuclear family [38, p. 127]:
example of a traditional nuclear family [38, p. 127]:
Its components are the parents and the children; the relevant environment is the
immediate physical environment, the neighbourhood, and the workplace; the struc-
ture is made up of such biological and psychological bonds as love, sharing, and
relations with others; and the mechanism consists essentially of domestic chores,
marital encounters of various kinds, and child rearing. If the central mechanism
breaks down, so does the system as a whole.
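To relate the CESM model to the IS reader's toolkit, here is a purely illustrative rendering (ours, not Bunge's or the paper's) of CESM as a Java record; the nuclear-family example above would instantiate its four fields:

import java.util.Set;

// C-E-S-M: composition (the system's components), environment (items it is
// connected with), structure (relations among components and environment),
// and mechanism (the characteristic processes that make it what it is).
public record Cesm<T>(
        Set<T> composition,
        Set<T> environment,
        Set<String> structure,
        Set<String> mechanism) {
}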
Adopting BSO in the context of IT makes it possible to remove the notion of a thing from the ontology, simply replacing it with the system construct. The inversion of the relationship between things and systems, and the potential obviation of the need for things in BSO, represents a major change, as the construct of thing has been a founding one for BWW and the conceptual foundation for many studies which adopted BWW [19, 51].
However, as things at the social and technical levels of early Bunge were effectively systems [30], this change can be easily accommodated by much of the prior work which used BWW with a mere replacement of a label.
As systems replace things, the complexity of the BWW ontology that dealt with the relationship between things and systems is reduced (e.g., obviating the need for constructs such as "composite thing" or "properties of things"). For example, BWW defines an internal event as "an event that arises in a thing, subsystem, or system" [33, p. 222]. This definition can simply state that an internal event is an event that arises in a system. Note, however, that an ontology that uses systems as its fundamental ontological primitive would probably borrow other constructs related to systems which are beyond BWW (e.g., CESM).
As in BWW, BSO continues to uphold the beliefs about the relationship between
systems and properties. Systems have properties. Properties do not exist outside of
systems [45, p. 175]: “Property-less entities would be unknowable, hence the hypothesis
of their existence is untestable; and disembodied properties and relations are unknown.”
As in BWW, properties according to BSO do not exist in themselves: “However, … can
be material only derivatively, that is, by virtue of the materiality of the things involved:
there are neither properties nor relations in themselves, except by abstraction.” [38,
p. 11].
The notions of classes and kinds are used in BSO, but somewhat differently compared with BWW. In BWW, classes are sets of things sharing "a common property", whereas kinds are sets of things which share "two or more" properties [33, p. 223]. In BSO, systems with "one or more" common properties form classes [44, p. 111], and systems whose common properties are interrelated form kinds [38, p. 13].

The greater emphasis on systems carries other implications, as this new postulate propagates throughout most of Bunge's recent beliefs. According to BSO, some but not all (an important caveat cf. BWW) systems undergo change, resulting in the emergence (addition of new) or submergence (loss of old) of properties. To account for this, BSO continues to use the construct of state. Bunge [45, p. 171] defines a state as "the list of the properties of the thing at that time" – a definition nearly identical to that in BWW [29, p. 125]. A state can describe multiple properties (at the same moment in time) [38]. A given system has the properties of its subsystems as well as its own, termed emergent properties – an idea unchanged since BWW, but now gaining greater focus in BSO as a key implication of systemism.

Whereas per BWW Bunge applies the notion of a state to all things [29, p. 123], in BSO Bunge [38] makes an important distinction between systems which undergo change and those which do not. In BWW, a set of postulates deals with changes of states (i.e., events) and with how the properties which make up the states are perceived by humans (i.e., attributes) [29]. However, for BSO these constructs do not apply to all systems.
Bunge distinguishes two kinds of system: conceptual and concrete [44, p. 270].
A conceptual (or formal) system is a system all the components of which are conceptual (e.g., propositions, classifications, and hypothetico-deductive systems, i.e., theories). This is contrasted with concrete (or material) systems, which are made of concrete components (i.e., subsystems, such as atoms, organisms, and societies), meaning that these components may undergo change.5
What distinguishes concrete and conceptual systems is the essential property of mutability – a key element of BSO – which only concrete systems possess: "mutability is the one property shared by all concrete things, whether natural or artificial, physical or chemical, biological or social, perceptible or imperceptible" [38, p. 10]. Bunge thus explains that changes in systems may only occur if the systems are concrete [38, p. 11]:

heat propagation, metabolism, and ideation qualify as material since they are processes in material things. By contrast, logical consistency, commutativity, and differentiability can only be predicated of mathematical objects.

Concrete systems change by virtue of energy transfer. For Bunge, "the technical word for 'changeability' is energy" [38, p. 12], such that:

To repeat, energy is not just a property among many. Energy is the universal property, the universal par excellence.

Multiple events form processes (a construct new compared with BWW), defined as "a sequence, ordered in time, of events and such that every member of the sequence takes part in the determination of the succeeding member" [45, p. 172].
5 Bunge [44, p. 270] also distinguishes a symbolic (or semiotic) system as a type of concrete system some components of which stand for or represent other objects (e.g., languages, computer diskettes, and diagrams).
6 This may potentially resolve the criticism levied against Bunge's ontology as being too physicalist [21, 28] – original ideas of Bunge captured in BWW without explicit qualification have indeed been cast by BSO as belonging only to material reality.
A pragmatic benefit of realism is that it encourages thinking and action beyond sensations and fosters an active stance toward reality.
There is no “end” of the BSO per se (recall, BSO is not published in a self-contained treatise), as Bunge continuously stresses the interdependency between ontology and other beliefs; thus, we draw a demarcation based on the constructs in BWW [33]. We note that Wand and Weber engaged with other ideas of Bunge, as did other scholars, e.g., [32, 54], and acknowledged the existence of other constructs and a more recent set of beliefs. As they note, Bunge “has written extensively about social phenomena using constructs based upon his ontology (e.g., Bunge, 1998)” [26, p. 6]. Yet, much of the IT community adopted the views of Bunge stemming from BWW, making this an important benchmark for comparison.
6 Conclusion
Mario Bunge made a profound mark on the fields of conceptual modeling, software engineering, information quality, and database design. Much of this influence has been via the BWW ontology – an incredibly valuable body of knowledge which popularized Bunge in the field of IT and became the foundation for numerous studies on the design and use of information technologies. Even researchers who disagreed with aspects of BWW and Bunge’s ontology benefited greatly from these ideas, as BWW provided a key benchmark and inspired others to pursue ontological studies in IT [4, 20, 21].
The significance and success of BWW motivated us to seek new ways in which Bunge’s extensive thinking can be leveraged in the design and use of IT. As we showed in our work, BSO contains ideas that, although somewhat compatible with BWW, are also quite different, raising new prospects and opening new possibilities.
The new BSO emerges as a complex and extensive set of beliefs. In this paper, we
began to expose its basic tenets and assumptions. However, this work is by no means
complete. Our key objective was to establish BSO as a new ontology. Much work remains to study BSO in its own right (including its benefits and limitations for applications in IT), to formalize it into a finite set of postulates (as Wand and Weber did for BWW), and to seek out areas of IT practice which could benefit from the application of these ideas.
In short, we call on researchers to consider adopting BSO as a promising new ontology.
References
1. Guerreiro, S., van Kervel, S.J., Babkin, E.: Towards devising an architectural framework for
enterprise operating systems. In: ICSOFT, pp. 578–585 (2013)
2. Henderson-Sellers, B.: Why philosophize; why not just model? In: Johannesson, P., Lee,
M.L., Liddle, S.W., Opdahl, A.L., López, Ó.P. (eds.) ER 2015. LNCS, vol. 9381, pp. 3–17.
Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25264-3_1
3. Gonzalez-Perez, C.: How ontologies can help in software engineering. In: Cunha, J., Fernandes, J.P., Lämmel, R., Saraiva, J., Zaytsev, V. (eds.) GTTSE 2015. LNCS, vol. 10223, pp. 26–44. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-60074-1_2
4. Guizzardi, G.: Ontological foundations for structural conceptual models. Telematics Instituut
Fundamental Research Series, Enschede, The Netherlands (2005)
5. Duarte, B.B., Falbo, R.A., Guizzardi, G., Guizzardi, R.S.S., Souza, V.E.S.: Towards an ontology of software defects, errors and failures. In: Trujillo, J.C., et al. (eds.) ER 2018. LNCS, vol. 11157, pp. 349–362. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00847-5_25
6. Lukyanenko, R., Larsen, K.R., Parsons, J., Gefen, D., Mueller, R.M.: Toward creating a
general ontology for research validity. In: International Conference on Conceptual Modeling,
Salvador, Brazil, pp. 133–137 (2019)
7. McDaniel, M., Storey, V.C.: Evaluating domain ontologies: clarification, classification, and
challenges. ACM Comput. Surv. 53(1), 1–40 (2019)
8. Verdonck, M., Gailly, F., Pergl, R., Guizzardi, G., Martins, B., Pastor, O.: Comparing traditional conceptual modeling with ontology-driven conceptual modeling: an empirical study. Inf. Syst. 81, 92–103 (2019)
9. Recker, J., Rosemann, M., Krogstie, J.: Ontology- versus pattern-based evaluation of process modeling languages: a comparison. Commun. Assoc. Inf. Syst. 20(1), 48 (2007)
10. Martínez Ferrandis, A.M., Pastor López, O., Guizzardi, G.: Applying the principles of an
ontology-based approach to a conceptual schema of human genome. In: Ng, W., Storey, V.C.,
Trujillo, J.C. (eds.) ER 2013. LNCS, vol. 8217, pp. 471–478. Springer, Heidelberg (2013).
https://doi.org/10.1007/978-3-642-41924-9_40
11. Pastor, Ó., España, S., González, A.: An ontological-based approach to analyze software production methods. In: Kaschek, R., Kop, C., Steinberger, C., Fliedl, G. (eds.) UNISCON 2008. LNBIP, vol. 5, pp. 258–270. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78942-0_26
12. Wand, Y., Wang, R.Y.: Anchoring data quality dimensions in ontological foundations.
Commun. ACM 39(11), 86–95 (1996)
13. Reinhartz-Berger, I., Itzik, N., Wand, Y.: Analyzing variability of software product lines using
semantic and ontological considerations. In: Jarke, M., et al. (eds.) CAiSE 2014. LNCS, vol.
8484, pp. 150–164. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07881-6_11
14. Guarino, N.: Formal ontology, conceptual analysis and knowledge representation. Int. J. Hum.
Comput. Stud. 43(5–6), 625–640 (1995)
15. Bodart, F., Patel, A., Sim, M., Weber, R.: Should optional properties be used in conceptual
modelling? A theory and three empirical tests. Inf. Syst. Res. 12(4), 384–405 (2001)
16. Burton-Jones, A., Weber, R.: Building conceptual modeling on the foundation of ontology. In:
Computing Handbook: Information Systems and Information Technology, pp. 15.1–15.24.
CRC Press, Boca Raton (2014)
17. Bera, P., Burton-Jones, A., Wand, Y.: Research note—how semantics and pragmatics interact
in understanding conceptual models. Inf. Syst. Res. 25(2), 401–419 (2014)
18. Recker, J., Rosemann, M., Green, P., Indulska, M.: Do ontological deficiencies in modeling
grammars matter? MIS Q. 35(1), 57–79 (2011)
A Journey to BSO: Evaluating Earlier and More Recent Ideas of Mario Bunge 357
19. Lukyanenko, R., Parsons, J., Wiersma, Y.: The IQ of the crowd: understanding and improving
information quality in structured user-generated content. Inf. Syst. Res. 25(4), 669–689 (2014)
20. Guizzardi, G., Wagner, G., Almeida, J.P.A., Guizzardi, R.S.: Towards ontological foundations
for conceptual modeling: the unified foundational ontology (UFO) story. Appl. Ontol. 10(3–4),
259–271 (2015)
21. March, S.T., Allen, G.N.: Toward a social ontology for conceptual modeling. In: Communications of the AIS, vol. 34 (2014)
22. Herre, H.: General formal ontology (GFO): a foundational ontology for conceptual modelling. In: Poli, R., Healy, M., Kameas, A. (eds.) Theory and Applications of Ontology: Computer Applications, pp. 297–345. Springer, Heidelberg (2010). https://doi.org/10.1007/978-90-481-8847-5_14
23. Gangemi, A., Guarino, N., Masolo, C., Oltramari, A., Schneider, L.: Sweetening ontologies
with DOLCE. In: Gómez-Pérez, A., Benjamins, V.R. (eds.) EKAW 2002. LNCS (LNAI), vol.
2473, pp. 166–181. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45810-7_18
24. Wand, Y., Weber, R.: Toward a theory of the deep structure of information systems. In:
International Conference on Information Systems, Copenhagen, Denmark, pp. 61–71 (1990)
25. Wand, Y., Weber, R.: An ontological analysis of some fundamental information systems
concepts. In: Proceedings of the Ninth International Conference on Information Systems,
vol. 1988, pp. 213–226 (1988)
26. Wand, Y., Weber, R.: Thirty years later: some reflections on ontological analysis in conceptual
modeling. J. Database Manag. (JDM) 28(1), 1–17 (2017)
27. Burton-Jones, A., Recker, J., Indulska, M., Green, P., Weber, R.: Assessing representation
theory with a framework for pursuing success and failure. MIS Q. 41(4), 1307–1333 (2017)
28. Wyssusek, B.: On ontological foundations of conceptual modelling. Scand. J. Inf. Syst. 18(1),
63–80 (2006)
29. Bunge, M.A.: Treatise on Basic Philosophy: Ontology I: The Furniture of The World. Reidel,
Boston (1977)
30. Bunge, M.A.: Treatise on Basic Philosophy: Ontology II: A World of Systems. Reidel
Publishing Company, Boston (1979)
31. Bunge, M.A., et al.: Mario Bunge: A Centenary Festschrift. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-16673-1
32. Rosemann, M., Wyssusek, B.: Enhancing the expressiveness of the Bunge-Wand-Weber
ontology. In: AMCIS 2005 Proceedings, pp. 1–8 (2005)
33. Wand, Y., Weber, R.: On the ontological expressiveness of information systems analysis and
design grammars. Inf. Syst. J. 3(4), 217–237 (1993)
34. Wand, Y., Weber, R.: Mario Bunge’s ontology as a formal foundation for information systems
concepts. In: Weingartner, P., Dorn, G. (eds.) Rodopi, pp. 123–150 (1990)
35. Jabbari, M., Lukyanenko, R., Recker, J., Samuel, B., Castellanos, A.: Conceptual modeling
research: revisiting and updating Wand and Weber’s 2002 research agenda. In: AIS SIGSAND,
pp. 1–12 (2018)
36. Saghafi, A., Wand, Y.: Conceptual models? A meta-analysis of empirical work. In: Hawaii
International Conference on System Sciences, Big Island, HI, pp. 1–15 (2014)
37. Veres, C., Mansson, G.: Psychological foundations for concept modeling. In: Blackwell, A.F.,
Marriott, K., Shimojima, A. (eds.) Diagrams 2004. LNCS (LNAI), vol. 2980, pp. 26–28.
Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25931-2_5
38. Bunge, M.A.: Chasing Reality: Strife over Realism. University of Toronto Press, Toronto (2006)
39. Bunge, M.A.: Treatise on Basic Philosophy: Ethics: The Good and The Right. Springer,
Amsterdam (1989). https://doi.org/10.1007/978-94-009-2601-1
40. Bunge, M.A.: Between Two Worlds: Memoirs of a Philosopher-Scientist. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-319-29251-9
41. Bunge, M.A.: The dark side of technological progress. In: Sassower, R., Laor, N. (eds.) The Impact of Critical Rationalism, pp. 109–113. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-90826-7_10
42. Bunge, M.A.: Treatise on Basic Philosophy: Semantics I: Sense and Reference. Springer,
Amsterdam (1974). https://doi.org/10.1007/978-94-010-9920-2
43. Bunge, M.A.: Treatise on Basic Philosophy: Volume 6: Epistemology & Methodology II:
Understanding the World. Reidel, Boston (1983)
44. Bunge, M.A.: Finding Philosophy in Social Science. Yale University Press, New Haven (1996)
45. Bunge, M.A.: Philosophy of Science: Volume 2, From Explanation to Justification. Routledge,
New York (2017)
46. Bunge, M.A.: Systems everywhere. In: Cybernetics and Applied Systems, pp. 23–41. CRC
Press, London (2018)
47. Bunge, M.A.: Emergence and Convergence: Qualitative Novelty and the Unity of Knowledge.
University of Toronto Press, Toronto (2003)
48. Bunge, M.A.: Systemism: the alternative to individualism and holism. J. Soc. Econ. 2(29),
147–157 (2000)
49. Bunge, M.A.: Gravitational waves and spacetime. Found. Sci. 23(2), 399–403 (2018). https://doi.org/10.1007/s10699-017-9526-y
50. Agazzi, E.: Systemic thinking. In: Matthews, M.R. (ed.) Mario Bunge: A Centenary Festschrift, pp. 219–240. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-16673-1_13
51. Parsons, J., Wand, Y.: Emancipating instances from the tyranny of classes in information
modeling. ACM Trans. Database Syst. 25(2), 228–268 (2000)
52. Hirst, R.J.: The Problems of Perception. Routledge, London (2002)
53. Hempel, C.G.: Philosophy of Natural Science. Pearson, London (1966)
54. Milton, S.K.: Ontological foundations of representational information systems. Scand. J. Inf.
Syst. 19(1), 5 (2007)
55. Bodart, F., Weber, R.: Optional properties versus subtyping in conceptual modeling: a theory
and empirical test. In: International Conference on Information Systems (1996)
56. Burton-Jones, A., Weber, R.: Properties do not have properties: investigating a questionable
conceptual modeling practice. In: Annual Symposium on Research in Systems Analysis and
Design (2003)
57. Gemino, A., Wand, Y.: Complexity and clarity in conceptual modeling: comparison of
mandatory and optional properties. Data Knowl. Eng. 55(3), 301–326 (2005)
58. Lukyanenko, R., Parsons, J., Wiersma, Y.F., Wachinger, G., Huber, B., Meldt, R.: Representing
crowd knowledge: guidelines for conceptual modeling of user-generated content. J. Assoc.
Inf. Syst. 18(4), 297–339 (2017)
59. Lukyanenko, R., Parsons, J., Samuel, B.M.: Representing instances: the case for reengineering conceptual modeling grammars. Eur. J. Inf. Syst. 28(1), 68–90 (2019). https://doi.org/10.1080/0960085X.2018.1488567
60. Samuel, B.M., Khatri, V., Ramesh, V.: Exploring the effects of extensional versus intentional
representations on domain understanding. MIS Q. 42(4), 1187–1209 (2018)
A New DEMO Modelling Tool that Facilitates
Model Transformations
Abstract. The age of digitization requires rapid design and re-design of enter-
prises. Rapid changes can be realized using conceptual modelling. The design and
engineering methodology for organizations (DEMO) is an established modelling
method for representing the organization domain of an enterprise. However, heterogeneity among enterprise design stakeholders generally demands transformations between conceptual modelling languages. Specifically, in the case of DEMO, a transformation into Business Process Model and Notation (BPMN) models is desirable to account for both the semantically sound foundation of the DEMO models and the wide adoption of the de-facto industry standard BPMN. Model transformation can only be efficiently applied if tool support is available. Our research starts
with a state-of-the-art analysis, comparing existing DEMO modelling tools. Using
a design science research approach, our main contribution is the development of
a DEMO modelling tool on the ADOxx platform. One of the main features of our
tool is that it addresses stakeholder heterogeneity by enabling transformation of
a DEMO organization construction diagram (OCD) into a BPMN collaboration
diagram. A demonstration case shows the feasibility of our newly developed tool.
1 Introduction
The age of digitization requires rapid design and re-design of enterprises. In addition,
the agile design paradigm embraces the use of multiple modelling languages to represent
design knowledge. Unfortunately, this paradigm also poses challenges regarding inconsistencies between model types that represent knowledge from the same knowledge domain. Modelling researchers should ensure that the models, languages, and methods they create can be adapted to changing requirements in the future [1, p. 3].
Domain-specific languages are created to provide insight and understanding within
a particular domain context and stakeholder group [2]. As an example, the design and
engineering methodology for organizations (DEMO) provides models that represent the
organization domain of an enterprise [3]. DEMO offers a unique design perspective, since
its four aspect models have the ability to represent organization design domain knowledge in a concise and consistent way, removing technological realization and implementation details [3]. One of DEMO’s aspect models, the construction model, incorporates
an organization construction diagram (OCD) that provides a concise representation of
enterprise operations. Managers value the OCD, since it becomes a blueprint that enables
discussions on enterprise (re-)design and strategic alignment [3, 4]. Recker et al. [5] and Van Nuffel et al. [6] indicated that unguided use of Business Process Modeling Notation (BPMN) constructs often leads to inconsistent models. It is thus our goal to combine
the strengths of DEMO and BPMN by proposing a model transformation and modelling
tool support.
Owing to DEMO’s characteristics of being consistent and concise, various authors have experimented with transformations between modelling languages, as discussed in the remainder of this paragraph. De Kinderen, Gaaloul and Proper [7] indicated that “ArchiMate lacks specificity on how to model different perspectives in-depth”, while [8, 9] add that ArchiMate lacks in expressing value exchange. As a solution to these deficiencies, [7] conducted a study to map concepts from DEMO to concepts contained within the business layer of the ArchiMate meta-model, with the purpose of modelling the essential aspects of an enterprise first in DEMO, followed by a transformation into an ArchiMate model that adds technological realization and implementation details. Based on the work of Caetano et al. [10] and Heller [11], Mráz et al. [12] presented transformation specifications to generate BPMN models from DEMO models. Yet, the specifications did not
consider the complexity of hierarchical structures in DEMO models. In addition, their
transformation specifications were not supported by tooling to automate DEMO-BPMN
transformations.
This study starts with an evaluation of existing DEMO modelling tools. We conclude
that existing modelling tools do not support all of DEMO’s four aspect models. In
addition, the tools do not facilitate transformations to other languages, such as BPMN.
The main objective of this article is to address stakeholder heterogeneity by developing a
DEMO modelling tool on the ADOxx platform. We demonstrate one of the main features
of our tool, namely to transform a DEMO organization construction diagram (OCD) into
a corresponding BPMN collaboration diagram.
The article is structured as follows. Section 2 provides background on multi-view
modelling, as well as the existing knowledge on DEMO concepts that are explained via a
demonstration case. Using design science research, as presented in Sect. 3, we present the
requirements for a new DEMO tool in Sect. 4 and the DEMO constructional components
that form part of the OMiLAB ecosystem, in Sect. 5. We also demonstrate the key
functionality of the new DEMO tool, i.e. semi-automatic OCD-BPMN transformations
for one out of four identified transformation scenarios. Section 6 ends with conclusions
and suggestions for future research.
2 Background
Bork [15] emphasised the need to develop consistent and concise conceptual models for domain-specific languages. Prior to developing tool support and model transformation, language specifications should at least provide syntax, semantics, and notation for the different viewpoints [16].
Mulder [17] also acknowledged the need to validate the existing DEMO specification language (DEMOSL) prior to developing tool support. Following the meta-model definition presented by [18], metamodels should be sufficiently complete to describe all sets of models (i.e. multiple viewpoints) that are allowed, rejecting models that are not valid.
In addition, the metamodel should enable partial transformation of the model (e.g. from
ontological to implementation level). With respect to the DEMO metamodels, Mulder
[17] already suggested improvements regarding the multiple viewpoints evident in four
aspect models. Since our first version of the DEMO-ADOxx tool only includes the
construction model (CM), we elaborate within the next section on the updated metamodel
for the CM.
Fig. 1. DEMO aspect models with diagram types and tables, based on [19] and [20]
The ontological model is based on a key discovery that forms the basis of the aspect
models, namely the identification of a complete transaction pattern that involves two
actor roles, a production act (and fact), and multiple coordination acts (and facts) that
are performed in a particular order [19]. Although it is possible to identify three different
sorts of a transaction kind (TK), i.e. original, informational and documental, the four
aspect models primarily focus on the original sort. A TK can also be classified as an
elementary TK when it is executed by only one actor role, or an aggregate TK (ATK)
when it is executed by multiple actor roles. Also, an actor role can be classified as either an elementary actor role (EAR), when s/he executes one TK, or a composite actor role (CAR), when s/he is the executor of more than one TK [19, 20].
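As an illustration of this classification rule (a sketch under our own naming, not part of DEMOSL or the DEMO-ADOxx tool), an actor role can be classified by counting its executor links:

def classify_actor_role(actor_role: str, executor_links: dict) -> str:
    # executor_links maps an actor role to the TKs it executes (illustrative).
    executed = executor_links.get(actor_role, [])
    if len(executed) == 1:
        return "EAR"  # elementary actor role: executes exactly one TK
    if len(executed) > 1:
        return "CAR"  # composite actor role: executor of more than one TK
    return "no executor link (e.g., an initiator-only environmental role)"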
The concepts that were discussed so far, as well as the relationships between con-
cepts, are described via a metamodel presented in [19]. Mulder [17] identified several
inconsistencies with regards to the CM, addressing the issues in [21]. Figure 2 presents
an updated metamodel that incorporates the extensions suggested by Mulder [21]. Note
that the Scope of Interest (SoI) is not modelled as a separate concept, since Mulder [21]
argues that the SoI is equivalent to the CAR. The relationships and cardinalities in Fig. 2
signify modelling constraints when a modeller composes a CM. The constraints should
also be incorporated in the modelling tool. As an example, a single relationship exists
between Transaction Kind (TK) and Aggregate Transaction Kind (ATK) in Fig. 2. The
relationship can be interpreted in a forward direction as: “One TK is contained in zero
or many ATKs”. In the reverse direction, the relationship is interpreted as: “One ATK contains one or many TKs”.
Fig. 2. DEMO construction model metamodel Version 3.7 [19] with extensions of [21]
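To illustrate how such a cardinality constraint translates into a tool-level check, the following sketch validates the TK-ATK relationship just described (our own naming, not the actual DEMO-ADOxx implementation):

class AggregateTransactionKind:
    def __init__(self, atk_id: str, contained_tks: list):
        self.atk_id = atk_id
        self.contained_tks = contained_tks  # TKs contained in this ATK

def validate_atk(atk: AggregateTransactionKind) -> None:
    # Reverse direction of the relationship: "One ATK contains one or many TKs".
    if len(atk.contained_tks) < 1:
        raise ValueError(f"{atk.atk_id}: an ATK must contain at least one TK")
    # The forward direction ("one TK is contained in zero or many ATKs")
    # imposes no lower bound, so no corresponding check is needed for a TK.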
The reader is referred to [19] for a comprehensive introduction to the OCD and the legend for the concepts included in Fig. 2 and Fig. 4. In our demonstration OCD, portrayed
in Fig. 4, we assume that we only include TKs that are of the original transaction sort,
in accordance with the guidelines presented by Dietz [20] to focus on the essential
TKs. Based on the concepts declared in [19], we use bold style to indicate the type of
construct and italics when referring to an instance of the construct (see Fig. 4).
Scope of Interest (SoI) indicates that the modeller analyses a particular scope of
operations, namely some operations at a college. Given the SoI, Fig. 4 indicates that
three environmental actor roles are defined, see the grey-shaded constructs student,
project sponsor and HR of project sponsor that form part of the environment. Within
the SoI, multiple transaction kinds (TKs) are linked to different types of actor roles
via initiation links or executor links. As an example, supervisor allocation (T01) is
a TK that is initiated (via an initiation link) by the environmental actor role student
(CA01). In accordance with [20], the student (CA01) is by default also regarded to be a
composite actor role “of which one does not know (or want to know) the details”. Since
T01 is linked to an environmental actor role, it is also called a border transaction
kind. T01 is executed (via the executor link) by the elementary actor role named
supervisor allocator (A01).
All the other actor roles in Fig. 4 within the SoI are elementary actor roles, since
each of them is only responsible for executing one transaction kind. A special case is where an elementary actor role is both the initiator and executor of a transaction
kind, also called a self-activating actor role. Figure 4 exemplifies the self-activating
actor role with module reviser (A04) and project controller (A05). Since actor roles
need to use facts created and stored in transaction banks, an information link is used
to indicate access to facts. As an example, Fig. 4 indicates that project controller (A05)
has an information link to transaction kind module revision (T04), indicating that the
project controller (A05) uses facts in the transaction bank of module revision (T04). It
is also possible that actor roles within the SoI need to use facts that are created via
transaction kinds that are outside the SoI. As an example, Fig. 4 indicates that actor
roles within the SoI (called, some operations at a college) need to use facts that are
created outside the SoI and stored in the transaction banks of aggregate transaction
kinds, namely person facts of AT01, college facts of AT02, accreditation facts of AT03,
timetable facts of AT04 and student enrollment facts of AT05. According to Fig. 4, the
student enrollment facts of aggregate transaction kind AT05 are not accessed by any
actor roles, which should be possible (according to the meta-model depicted in Fig. 2).
Even though Fig. 4 only includes elementary actor roles within the SoI, it is possible
to consolidate elementary actor roles within a composite actor role, where a composite
actor role “is a network of transaction kinds and (elementary) actor roles” [20]. Figure 4
illustrates two composite actor roles within the SoI, namely College (CA00) and Controller (CA01). Both CA00 and CA01 encapsulate a number of transaction kinds and
elementary actor roles.
3 Research Method
Applying design science research (DSR), we developed the DEMO-ADOxx modelling
tool. According to Gregor & Hevner’s [22] knowledge contribution framework, the
modelling tool can be considered as an improvement, since the tool will be used for
solving a known problem. Referring to the DSR steps of Peffers et al. [23], this article
addresses the five steps of the DSR cycle in the following way:
Identify a Problem: In Sect. 4.1 we present minimal requirements for a useful DEMO
modelling tool. Based on the requirements, we assess in Sect. 4.2 that existing DEMO
modelling tools are inadequate.
Define Objectives of the Solution: In Sects. 4.3 and 4.4 we specify a new DEMO-
ADOxx tool to address the requirements. We highlight that the DEMO-ADOxx tool
only supports one of the four aspect models, namely the CM. Furthermore, the tool only
incorporates two of the three CM representations, namely the OCD and TPT.
Evaluation: Evaluation was restricted to internal testing, using the DEMO-ADOxx tool
to model a more extensive case (than the demonstration case in Sect. 2.2). Individual test
scenarios were created to validate each of the relationships and cardinalities illustrated
in Fig. 2. The study excluded further evaluation, but Sect. 6 provides suggestions on
further evaluating and extending the DEMO-ADOxx tool.
a DEMO tool, structured according to the first two categories. The purpose is to compare
and evaluate the existing DEMO tools in terms of the following minimum requirements
defined from the perspective of a lecturer teaching DEMO:
• R1: The DEMO tool should be comprehensive in supporting all of the DEMO aspect
models, namely the CM, PM, AM and FM (refer to Fig. 1).
• R2: The DEMO tool should support the most recent published language specification,
i.e. DEMOSL 3.7 (see [19]) and the extensions that have been published (see [21]).
The tool should be ready to accommodate future upgrades of the DEMO language.
• R3: The DEMO tool should facilitate model transformations to other modelling
languages such as BPMN.
• R4: The DEMO tool should be available at low cost, especially for educational
purposes.
• R5: The DEMO tool should be usable, i.e. user-friendly.
• U1 Consistency: The system needs to be consistent in its actions, so that the modeller
can get used to the system without constantly having to adapt to a new way of doing
things. Consistency should apply to the way icons and commands are displayed and
used.
• U2 User Control: The system should offer the user control in the way the model is
built and run. This could include cancelling/pausing operations, undoing or redoing
steps. The modeller should be able to foresee or undo errors.
• U3 Ease of learning: The system should be easy to learn for a new modeller. This is
achieved by avoiding icons, layouts and terms that are unfamiliar to the modeller.
• U4 Flexibility: The system is expected to offer different ways to accomplish the same
task so that the user experiences maximum freedom. Examples include shortcut keys,
different icon options or even layout customisation.
• U5 Error Management: The system is expected to have built-in counter-measures to
prevent mistakes by displaying error messages, warning icons or simply preventing
incorrect placement of model elements.
• U6 Reduction of Excess: The system should avoid displaying unnecessary information
or adding unnecessary functionality to the tool. The program should be functional and
easy to understand.
• U7 Visibility of System Status: The user of the system should be aware of the status
of the system at all times. For example, if a command does not occur instantaneously,
then the system should inform the user of the delay.
of preference: (1) experimenting with the tools that were available; (2) contacting the
tool owners for information about their tools; and (3) using the tool evaluation results of
Mulder [26]. During a second phase, we tested the usability (R5) of four tools that were
openly available (see Table 2).
uSoft studio   ● ◯ ◯ ◯ ◯ ◯ ?
Visio          ● ◯ ◯ ● ◯ ◯ ◐ ?   (level of support is unclear)
Xemod          ● ● ◯ ● ◯ ◯ ◯
Abacus         ● ◐ ◯ ◯ ◯ ◯ ◐
DEMO-ADOxx     ● ◯ ◯ ◯ ● ● ●

               U1  U2  U3  U4  U5  U6  U7
Abacus         ●   ●   ●   ●   ●   ●   ●
ModelWorld     ●   ◯   ●   ◯   ●   ●   ◯
Plena          ●   ●   ◐   ●   ●   ◐   ●
DEMO-ADOxx     ●   ●   ●   ◐   ●   ●   ●
Our ADOxx tool does not comply with R1, since the initial focus of the tool is to support the CM. For R2, the ADOxx tool supports DEMOSL 3.7 and the published extensions. For R3, only the ADOxx tool supports transformations from DEMO models to other model types. Regarding R4, the ADOxx tool is free of charge for educational purposes.
Phase 2 Evaluation: We had access to three of the existing DEMO modelling tools
listed in Table 1, namely Abacus, ModelWorld and Plena. Using Nassar’s usability
requirements [25] listed in Sect. 4.1, we evaluated each of the three tools, also adding
the DEMO-ADOxx tool, to gain some insights regarding their usability. The results are
summarised in Table 2, indicating that three of the tools have usability drawbacks.
The purpose of the evaluation was to provide an overview of the existing DEMO
modelling tools to establish whether a new DEMO tool was needed. Even though existing
tools are available, our main concern is that existing tools do not address requirements
R2, R3 and R4. The new DEMO-ADOxx tool has been developed as a main deliverable
for this study to address these three requirements. In terms of R1, the next section
motivates the decision to initially set the scope to the DEMO CM.
A qualitative analysis of DEMO aspect models indicates that the CM, detailed by the PM, is useful for assigning responsibilities and duties to individuals [4]. The AM and FM “are necessary if you are going to develop or select applications” [4]. Since the conceptual knowledge embedded in the PM is similar to that of the BPMN collaboration diagram [12] and BPMN is widely adopted by industry [27, 28], the initial DEMO-ADOxx tool focuses on the CM. We exclude the PM, since the PM logic can also be represented by the industry-accepted BPMN notation. Our tool ensures consistent OCD-derived BPMN collaboration diagrams that incorporate the logic embedded in the DEMO standard transaction pattern as defined in [19].
We incorporated recent specifications regarding the OCD and TPT, as stated in [19]
and [21], as well as BPMN 2.0 [29] for the first version of the DEMO-ADOxx tool.
All of the existence rules shown in Fig. 2 were implemented, except for one. The rule “facts with fact kind FK are contained in the bank of TK”, indicated in Fig. 2, has not been incorporated in the DEMO-ADOxx tool, since it relates to the bank contents table (BCT), and the BCT relates to concepts that are used as part of the FM.
• Scenario 1: Customer-initiated TK with no parts. For this scenario, an actor role that
is outside the scope-of-interest, initiates a TK. Also, the TK does not have any parts, i.e., the executor of the TK does not initiate other TKs. Referring to Fig. 4, the TK
labelled T01 (supervisor allocation) is an example of this scenario. T01 is initiated
by the actor role student. The executor of T01 is the supervisor allocator. Yet, the
supervisor allocator does not initiate any other TKs as parts.
• Scenario 2: TK is part of another TK. For this scenario, the selected TK forms part of
another TK. Referring to Fig. 4 the TK labelled T07 (project involvement) is initiated
by an actor role A06 (internal project sponsor). Since the internal project sponsor
is both the executor of T06 (internal project sponsoring) and the initiator of T07
(project involvement), T07 is a part of T06.
• Scenario 3: TK is self-initiating. For this scenario, the selected TK is initiated and
executed by the same actor role. Referring to Fig. 4, the TK labelled T04 (module
revision) is initiated and executed by A04 (module reviser).
• Scenario 4: TK has one or more parts. For this scenario, the selected TK has one or more parts, i.e. the actor role that executes the TK also initiates one or more other TKs. Referring to Fig. 4, the TK labelled T05 (project control) is executed by actor role A05 (project controller). The same actor role A05 (project controller) also initiates multiple other TKs, namely T02 (project sponsoring), T03 (IP clearance), and T06 (internal project sponsoring). A sketch of how a tool might distinguish these four scenarios is given below.
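The following sketch shows how a tool might assign one of the four scenarios to a selected TK. The function, parameter names and precedence are our illustrative assumptions, not the DEMO-ADOxx code; the scenarios can overlap (e.g., T05 is both self-activating and has parts), so an ordering must be chosen:

def classify_tk(initiators: set, executor: str, parts: set, scope: set) -> str:
    # initiators: actor roles initiating the TK; executor: its executing actor
    # role; parts: TKs initiated by the executor; scope: actor roles in the SoI.
    if parts:
        return "Scenario 4: TK with one or more parts"        # e.g., T05
    if executor in initiators:
        return "Scenario 3: self-initiating TK"               # e.g., T04
    if any(actor not in scope for actor in initiators):
        return "Scenario 1: customer-initiated TK, no parts"  # e.g., T01
    return "Scenario 2: TK that is part of another TK"        # e.g., T07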
The menu options are depicted at the top of the screen. We implemented a Model Analysis menu that provides the option to either generate a TPT, such as the one in Fig. 3, or to validate a model.
The Validation feature implements each of the existence rules (relationships and cardinalities) presented in Fig. 2, except for one, as indicated in Sect. 4.3. Figure 6 illustrates a validation table that communicates to the modeller: (1) the nature of a mistake in the model; and (2) the model constructs involved.
Based on the demonstration case discussed in Sect. 2.2, we used the new tool to
generate an OCD (see Fig. 4) as well as a TPT (see Fig. 3) by utilizing the implemented
semi-automatic model transformations of the DEMO-ADOxx tool.
Based on the modeller selections illustrated in Fig. 7, the DEMO-ADOxx tool automatically generates the corresponding BPMN collaboration diagram (see Fig. 8). The BPMN diagram (Fig. 8) presents the initiating actor role (internal project sponsor) and the executing actor role (student) each as a BPMN pool. In accordance with the transformation specifications (not detailed in this article), the transaction pattern detail for the standard pattern is depicted via BPMN concepts.
The meta-model provided a good baseline for the DEMO-ADOxx tool. Yet, we accept that the meta-model will change, and these changes will need to be accommodated by our tool in the future. We still await feedback on the OCD-BPMN transformation specifications, which will require further work on the DEMO-ADOxx tool. Realizing the tool as an open source project within the OMiLAB ensures that a community can take over future tool enhancements.
The demonstration case was useful in presenting the key features of the new DEMO-
ADOxx tool. In terms of the usability requirements, additional evaluation is required. For future work, DEMO modellers will be involved in usability tests to inform further tool enhancements. In addition, a new version of DEMOSL will be released during 2020 and will need to be incorporated within the DEMO-ADOxx tool.
References
1. Frank, U., Strecker, S., Fettke, P., Vom Brocke, J., Becker, J., Sinz, E.J.: The research field: modelling business information systems. Bus. Inf. Syst. Eng. 6(1), 1–5 (2014). https://doi.org/10.1007/s12599-013-0301-5
2. Karagiannis, D., Mayr, H.C., Mylopoulos, J.: Domain-Specific Conceptual Modeling: Concepts, Methods and Tools. Springer, Switzerland (2016). https://doi.org/10.1007/978-3-319-39417-6
3. Dietz, J.L.G.: Enterprise Ontology. Springer, Berlin (2006). https://doi.org/10.1007/3-540-33149-2
4. Décosse, C., Molnar, W.A., Proper, H.A.: What does DEMO do? A qualitative analysis about DEMO in practice: founders, modellers and beneficiaries. In: Aveiro, D., Tribolet, J., Gouveia, D. (eds.) EEWC 2014. LNBIP, vol. 174, pp. 16–30. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-06505-2_2
5. Recker, J., Indulska, M., Rosemann, M., Green, P.: How good is BPMN really? Insights
from theory and practice. In: Ljungberg, J., Andersson, M. (eds.) Proceedings 14th European
Conference on Information Systems, ECIS, pp. 1582–1593 (2006)
6. Van Nuffel, D., Mulder, H., Van Kervel, S.: Enhancing the formal foundations of BPMN by enterprise ontology. In: Albani, A., Barjis, J., Dietz, J.L.G. (eds.) CIAO!/EOMAS 2009. LNBIP, vol. 34, pp. 115–129. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01915-9_9
7. de Kinderen, S., Gaaloul, K., Proper, H.A.: On transforming DEMO models to ArchiMate. In: Bider, I., et al. (eds.) BPMDS/EMMSAD 2012. LNBIP, vol. 113, pp. 270–284. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31072-0_19
8. Pijpers, V., Gordijn, G., Akkermans, H.: E3alignment: exploring inter-organizational alignment in networked value constellations. Int. J. Comput. Sci. Appl. 6(5), 59–88 (2009)
9. Ettema, R., Dietz, J.L.G.: ArchiMate and DEMO – mates to date? In: Albani, A., Barjis, J., Dietz, J.L.G. (eds.) CIAO!/EOMAS 2009. LNBIP, vol. 34, pp. 172–186. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01915-9_13
10. Caetano, A., Assis, A., Tribolet, J.: Using DEMO to analyse the consistency of business process models. In: Moller, C., Chaudhry, S. (eds.) Advances in Enterprise Information Systems II, pp. 133–146. Taylor & Francis Group, London (2012)
11. Heller, S.: Usage of DEMO methods for BPMN models creation. Czech Technical University
in Prague (2016)
12. Mráz, O., Náplava, P., Pergl, R., Skotnica, M.: Converting DEMO PSI transaction pattern into BPMN: a complete method. In: Aveiro, D., Pergl, R., Guizzardi, G., Almeida, J.P., Magalhães, R., Lekkerkerk, H. (eds.) EEWC 2017. LNBIP, vol. 284, pp. 85–98. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57955-9_7
13. France, R., Rumpe, B.: Model-based development. Softw. Syst. Model. 7(1), 1–2 (2008)
14. Cicchetti, A., Ciccozzi, F., Pierantonio, A.: Multi-view approaches for software and system
modelling: a systematic literature review. Softw. Syst. Model. 18(6), 3207–3233 (2019).
https://doi.org/10.1007/s10270-018-00713-w
15. Bork, D.: A development method for conceptual design of multi-view modeling tools with
an emphasis on consistency requirements. University of Bamberg (2016)
16. Grundy, J., Hosking, J., Li, K.N., Ali, N.M., Huh, J., Li, R.L.: Generating domain-specific
visual language tools from abstract visual specifications. IEEE Trans. Softw. Eng. 39(4),
487–515 (2013)
17. Mulder, M.A.T.: Validating the DEMO specification language. In: Aveiro, D., Guizzardi, G.,
Guerreiro, S., Guédria, W. (eds.) EEWC 2018. LNBIP, vol. 334, pp. 131–143. Springer, Cham
(2019). https://doi.org/10.1007/978-3-030-06097-8_8
18. Aßmann, U., Zschaler, S., Wagner, G.: Ontologies, meta-models, and the model driven paradigm. In: Calero, C., Ruiz, F., Piattini, M. (eds.) Ontologies for Software Engineering and Software Technology, pp. 249–273. Springer, Heidelberg (2006). https://doi.org/10.1007/3-540-34518-3_9
19. Dietz, J.L.G., Mulder, M.A.T.: DEMOSL-3: demo specification language version 3.7. SAPIO
(2017)
20. Perinforma, A.P.C.: The Essence of Organisation, 3rd ed. Sapio (2017). www.sapio.nl
21. Mulder, M.A.T.: Towards a complete metamodel for DEMO CM. In: Debruyne, C., Panetto,
H., Guédria, W., Bollen, P., Ciuciu, I., Meersman, R. (eds.) OTM 2018. LNCS, vol. 11231,
pp. 97–106. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11683-5_10
22. Gregor, S., Hevner, A.: Positioning and presenting design science research for maximum
impact. MIS Q. 37(2), 337–355 (2013)
23. Peffers, K., Tuunanen, T., Rothenberger, M., Chatterjee, S.: A design science research
methodology for information systems research. J. MIS 24(3), 45–77 (2008)
24. Leffingwell, D.: Agile Software Requirements: Lean Requirements Practices for Teams,
Programs, and the Enterprise. Addison-Wesley, New Jersey (2011)
25. Nassar, V.: Common criteria for usability review. Work 41(Suppl 1), 1053–1057 (2012)
26. Mulder, M.A.T.: Enabling the automatic verification and exchange of DEMO models. Ph.D.
thesis (n.d.)
27. Grigorova, K., Mironov, K.: Comparison of business process modeling standards. Int. J. Eng.
Sci. Manag. Res. 1(3), 1–8 (2014)
28. Recker, J., Wohed, P., Rosemann, M.: Representation theory versus workflow patterns – the case of BPMN. In: Embley, D.W., Olivé, A., Ram, S. (eds.) ER 2006. LNCS, vol. 4215, pp. 68–83. Springer, Heidelberg (2006). https://doi.org/10.1007/11901181_7
29. Object Management Group: Business process model & notation. https://www.omg.org/bpmn/.
Accessed 30 May 2019
30. Bork, D., Buchmann, R.A., Karagiannis, D., Lee, M., Miron, E.-T.: An open platform for
modeling method conceptualisation: the OMiLAB digital ecosystem. Commun. AIS 44(32),
673–697 (2019)
Reference Method for the Development
of Domain Action Recognition Classifiers:
The Case of Medical Consultations
1 Introduction
Machine Learning (ML) and Artificial Intelligence (AI) have been introduced to many different industries and fields of research to automate many tasks, including the recognition and classification of human actions. Large, context-specific datasets are needed to train, validate and test the classifiers, but not every available dataset can be used for every purpose [24]. When a specific classification task needs to be executed, chances are that no relevant dataset exists.
We focus on human action recognition: the classification problem of “labeling videos containing human motion with action classes” [20]. Thanks to the advancements in action recognition, researchers are now able to analyze more
2 Research Method
Techniques, methods and processes for data analysis and ML already exist, but are not tailored to the specific purpose of developing domain action recognition classifiers.1 Typically, ML literature assumes the reader has some knowledge of or experience with ML. While such texts tend to follow an implicit method, they predominantly explain how certain algorithms can be implemented, and domain understanding is assumed when a case study is described. Therefore, we use an assembly-based method engineering approach, resulting in the following research question: “How can a reference method be assembled for the development of domain action recognition classifiers?”
Ralyté et al. [21] distinguish three main activities in their assembly-based
process model: (i) specify method requirements, (ii) select method chunks and
(iii) assemble chunks. In our case, the method requirements are as follows. First,
the method should provide guidance to practitioners in information systems,
rather than ML experts. Second, we focus only on action recognition classifiers.
Third, we are concerned with classifiers for a given domain: the method should
cover the entire process from domain understanding to deploying the classifier. However, the method should be domain-independent, i.e., applicable to any
domain in which action recognition is used.
The reference method aims to provide a structured overview of the activities
and deliverables and consistent terminology [27]. We hope our method mitigates
the risk of introducing errors throughout the process. Also, since all activities
1 Throughout this paper, ‘action recognition’ stands for both action and interaction recognition.
3 Related Works
To the best of our knowledge, no systematic approach exists that describes the
development of action recognition classifiers from problem statement to deployment. Thus, we start from data science frameworks and assemble a reference
method for building effective classifiers for action recognition. We compare three
processes: the Common Task Framework (CTF) [6], the Knowledge Discovery
in Databases (KDD) [7] process, and the CRoss Industry Standard Process for
Data Mining (CRISP-DM) [28].
CTF is meant for predictive modeling, making it suitable for classifier development. However, it includes only three main elements according to Donoho [6]: (i) a publicly available training dataset with feature measurements and labels, (ii) competitors that infer class prediction rules from the data, and (iii) a referee that receives the predictions, compares them to the test data, and returns the prediction accuracy. This framework, however, requires an existing dataset. In addition, competing teams are needed to conduct the common task, and such teams may not be readily available and/or willing to participate.
The KDD process describes the following steps: data selection, pre-processing, transformation, data mining and interpretation/evaluation. The process was designed for use in the data mining field [7]. KDD differentiates itself by focusing on the entire knowledge discovery process. Unfortunately, the process does not start with a specific problem that needs to be solved. Moreover, it does not address deployment in another system, since the result of the process is knowledge that can be used.
Thirdly, CRISP-DM was selected, since it was designed for projects with large amounts of data that aim at specific business-related objectives. The CRISP-DM process model prescribes six phases [28]: business understanding, data understanding, data preparation, modeling, evaluation and deployment, the last of which concerns presenting the results in such a way that the customer can use them, for instance by writing a report.
The application to specific domains requires the classifier to be trained for each domain, which in turn means the method must be able to handle large amounts of data.
Furthermore, the classifier should be deployed in the domain context or system
so that it can be used for prediction tasks by the stakeholder(s). Therefore, the
CRISP-DM process is used as a starting point, as it best fits the requirements for the reference method.
(Figure: process-deliverable diagram of the reference method. The activity side covers the phases (1) domain understanding, (2) dataset creation, (3) dataset preparation, (4) classifier modeling and (5) classifier training, followed by deployment; the deliverable side includes concepts such as DOMAIN ACTION, GUIDELINE, RECORDING SESSION, CAMERA ANGLE, VARIATION HEURISTIC, SUBJECT, ANNOTATION, LEARNING METHOD, FEATURE SET, DATASET, TRAINING SET and FEEDBACK.)
In the data preparation phase, the CRISP-DM model prescribes data cleaning
and formatting. To prepare for supervised learning, the instances (actions) in the
data should be labeled [13]. So, after recording, the videos should be prepared
for training in the dataset preparation phase. If multiple cameras were used,
the videos should be edited, such that they have the same dimensions and are
uniform in terms of brightness, frame rate, etc. This is done to ensure that the
variations in formatting do not affect the predictions. Besides, data augmentation
can be applied during the editing activity, which means adding noisy versions
of existing data to increase the size of the dataset. The augmentation approach
depends on the data. For instance, when working with imagery, it is possible to
mirror the existing images to create additional data. Alternatively, 3D synthesis
can be utilized to generate synthetic data from real data, also with the intent
of increasing the volume of data [25]. Then, recordings need to be cut into separate sessions, since a recording can span multiple sessions (to save time while recording). Subsequently, they can be annotated (supplied with the correct
labels), so that the ground truth is available for training the classifier.
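As a minimal sketch of the mirroring-based augmentation mentioned above (assuming OpenCV is available; the file name in the usage example is hypothetical):

import cv2  # assumes the opencv-python package

def augment_by_mirroring(frame):
    # Create a horizontally mirrored copy of a frame; flipCode=1 flips
    # around the vertical axis. Note that mirrored copies keep their labels
    # here, which is only valid if the action classes are side-agnostic.
    return cv2.flip(frame, 1)

# Hypothetical usage on a single decoded frame:
# frame = cv2.imread("session_042_frame_0001.png")
# mirrored = augment_by_mirroring(frame)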
For the annotation, we distinguish four different labels: posture of the subject, region of interest, domain action and touch. Since the first three should be defined for a specific domain, they have not been extended with specific options. According to Moeslund et al. [16], concepts like action, activity and behavior are sometimes considered synonyms. In this method, however, we adopt their action taxonomy: (i) action primitives, which are atomic entities that comprise an action, (ii) actions, sets of action primitives that are needed to perform a particular action, and (iii) activities, collections of actions that describe a larger event. For example, ‘playing tennis’ is an activity that contains actions such as ‘serve’ and ‘return ball’, where the latter consists of action primitives like ‘forehand’, ‘run right’ and ‘jump’ [16]. Preferably, the quality of the annotations is assessed. As an example, Mathias et al. [15] illustrated that re-annotating data with strict and consistent rules, as well as adding ‘ignore’ tags to unrealistically difficult samples, can have a significant impact on the assessment of classifiers: incorrect annotations counted as errors created an artificial slope on the assessment curves and biased the results. Assigning multiple labels to individual items, also referred to as repeated labeling, has also proven valuable [23].
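A hypothetical rendering of the Moeslund et al. [16] taxonomy described above as a nested structure, using the tennis example (the primitives listed for ‘serve’ are our own illustrative additions):

taxonomy = {
    "playing tennis": {                                    # activity
        "serve": {"toss ball", "swing"},                   # action -> primitives (illustrative)
        "return ball": {"forehand", "run right", "jump"},  # action -> primitives [16]
    }
}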
Subsequently, in the classifier modeling phase, a learning method is selected. Kong and Fu distinguish two types of methods: shallow and deep. Action recognition predominantly makes use of deep methods, with shallow methods being better suited when small amounts of training data are available [12]. Herath et al. divide action recognition into two similar categories, namely representation-based solutions and deep networks-based solutions [10]. For the former, some sort of representation, e.g., keypoints, silhouettes, or 2D/3D models, is required in order to train, validate and test the classifier. If a deep method is selected, there are three different options: (i) if a large, labeled dataset is available, it is possible to perform end-to-end training of a deep neural net, (ii) if the dataset has some labeled data, data augmentation or transfer learning can be applied, i.e., employing large datasets to train deep networks and then using
testing are conducted k times. After training and validation are completed using all selected parameter sets, the parameter set that yields the best performance is selected for testing the classifiers. No specific parameter sets are prescribed, since these may vary between domains. It should be noted that the subsets of the dataset should under no circumstances overlap. However, it is possible to join the training and validation sets for a final round of training before reporting results on the test set. In some cases, dataset creation and classifier training are performed iteratively, until a certain accuracy level is reached. If the accuracy level on the validation set is unacceptable, there are four options: (i) re-training the classifier, using k-fold cross-validation for example, (ii) creating different feature sets and/or selecting a different classifier, (iii) editing or augmenting the recordings in order to increase the size of the dataset or creating synthetic data, and (iv) creating additional data from scratch.
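The parameter-selection loop described above can be sketched with scikit-learn’s KFold; the classifier (an SVM) and the parameter grid are generic placeholders, not the classifiers used in this study:

import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

def select_parameters(X: np.ndarray, y: np.ndarray, parameter_sets: list, k: int = 5) -> dict:
    # Return the parameter set with the best mean validation accuracy over k folds.
    kf = KFold(n_splits=k, shuffle=True, random_state=0)
    best_params, best_score = None, -np.inf
    for params in parameter_sets:
        scores = []
        for train_idx, val_idx in kf.split(X):
            clf = SVC(**params).fit(X[train_idx], y[train_idx])
            scores.append(clf.score(X[val_idx], y[val_idx]))
        if np.mean(scores) > best_score:
            best_params, best_score = params, float(np.mean(scores))
    return best_params

# After selection, the training and validation folds may be joined for a final
# round of training before reporting accuracy on the held-out test set.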
If the classifier is sufficiently accurate, it can be used in its intended domain during the deployment phase. If not, the classifier should not be considered for deployment. Instead, an attempt can be made to identify more appropriate feature sets or a better-suited classifier. Context systems are systems used in the specific domain; the results of the classifiers might need to be included in an existing system, for instance. Finally, end-users of the classifier might want or need to provide feedback on its performance. If users notice that one specific action is often classified as another, additional data and training might be needed. This is in accordance with the monitoring and maintenance plan task included in the CRISP-DM process [28]. Since feedback cannot be predicted beforehand, it is assumed that all phases in the process can be affected.
on the instruments available for this study and whether the actions are of a sensitive nature. For the former, actions such as performing an ECG were excluded,
since we did not possess such instruments. For the latter, actions that require
subjects to (partially) undress were excluded to preserve the privacy of subjects
participating in the recordings. The following medical actions were included in
the dataset, in decreasing occurrence order in the guidelines [22]: Blood pressure
measurement (BPM), Palpation abdomen (PaA), Percussion abdomen (PeA),
Auscultation lungs (AL), Auscultation heart (AH), and Auscultation abdomen
(AA).
In addition, there is a ‘no action’ class, in case no action is occurring in a
segment of the video. We also distinguish the following classes: ‘sitting upright’,
‘laying down’ and ‘laying down with knees bent’ (posture of patient), whether
the GP touches the patient or not (distance to patient) and ‘arm’, ‘chest’, ‘upper
back’ and ‘abdomen’ (region of interest). Actions, however, are not always per-
formed in isolation, but may be part of a sequence. In discussion with medical
professionals, and taking into account the guidelines, the most frequently occur-
ring combinations and order within those combinations were used most often,
with some slight variation to mitigate overtraining on specific sequences. Using
the previously mentioned taxonomy [16], we distinguish the following examples:
(i) ‘pressing with the hand’ and ‘releasing pressure of the hand’ as action prim-
itives, (ii) ‘palpation of the abdomen’ as action and (iii) ‘physical examination’
or ‘medical consultation’ as activity.
2. Dataset creation. The videos [22] were recorded with four subjects (three
female, one male), who all played the roles of both GP and patient. Since GPs
examine one patient at a time, exactly two subjects appear in all videos. In
68% of the videos, both subjects were female, in 15% only the GP was female
and in 17% only the patient was female. In addition, they all changed clothes,
hair and jewelry. The GP is required to wear their hair up, but the patient
can have any hairstyle. The same is true for jewelry, since GPs are not allowed
to wear any jewelry while patients are. It should also be noted that, in the
Netherlands, GPs rarely wear white coats, which is why the GP role also changed
clothing throughout the recordings. Glasses were worn by both GP and patients.
During examinations, the patient is either sitting upright, laying down flat or
laying down with their knees bent. The GP performed actions from either side
of the patient and there is variation within actions. For instance, when listening
to the lungs, sometimes they started on the right and sometimes on the left.
Stethoscopes were used during the recordings; both a red one and a black one were
used to introduce variation. Finally, three cameras at three different angles were
used, as visualized in [22].
Since the GP (subject A) only performs medical actions using their hands,
the lower half of their body is not of importance and can be hidden behind the
examination table or gurney. The GP may have to move around to perform the
appropriate medical actions, which is why three cameras were used. By placing
them in different positions and thus acquiring three different angles, the chances
of occlusion are reduced. A total of 451 videos were recorded, 73.6% of which
included a single action (those videos contained frames without any action as
well), the rest contained sequences [22].
3. Dataset preparation. All recordings were cut into sessions, 192 in total [22],
which were then annotated, i.e., provided with the correct labels.
The online annotation tool ELAN [9] was used, since it allows for annotating
multiple videos concurrently. An example of how part of a video is annotated is:
‘sitting upright’ (posture subject), ‘arm’ (region of interest), ‘true’ (touch) and
‘BPM’ (domain action).
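To make the annotation scheme concrete, such a labeled segment can be represented as a simple record; the field names below are illustrative assumptions, not the actual ELAN tier names used in the study:

```python
# Illustrative record for one annotated video segment; field names are
# assumptions for exposition, not the actual ELAN tier names.
from dataclasses import dataclass

@dataclass
class AnnotatedSegment:
    start_frame: int
    end_frame: int
    posture: str   # 'sitting upright', 'laying down', 'laying down with knees bent'
    region: str    # 'arm', 'chest', 'upper back', 'abdomen'
    touch: bool    # whether the GP touches the patient
    action: str    # 'BPM', 'PaA', 'PeA', 'AL', 'AH', 'AA', or 'no action'

example = AnnotatedSegment(0, 120, "sitting upright", "arm", True, "BPM")
```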
One type of feature is the angle that a line between two joints makes with the
horizontal axis of each camera viewpoint [29,30]. We calculate angles differently
from previous literature, which uses the angles of two lines between three
neighboring joints. Our approach to computing angles allows us to embed the body
orientation of each subject relative to the camera viewpoint and, therefore,
relative to the other subject as well. The 2D joint coordinates (Eq. 1) are
determined in order to calculate the angles and distances:

$J_c = (J_x, J_y)$   (1)

We use J for joint and J_c for the 2D joint coordinate, where J_x and J_y represent the
x and y axes, respectively. Secondly, the angle between a line of two neighboring
joints and the horizontal axis (Eq. 2) is calculated as follows [18]:

$LL_a = LL_a(L_{J_1 \to J_2}, L_h) = \arctan\left(\frac{J_{1x} - J_{2x}}{J_{1y} - J_{2y}}\right)$   (2)
LLa , LJ1 →J2 , Lh represent line-line angle, line between two joints, and horizontal
line, respectively. Finally, the (Euclidean) distance between two or more joints
of one subject and the distances between the joints of two or more subjects are
determined (Eq. 3) using the following equation:
$JJ_d = JJ_d(J_1, J_2) = \|\overrightarrow{J_1 J_2}\| = \sqrt{(J_{1x} - J_{2x})^2 + (J_{1y} - J_{2y})^2}$   (3)
Note that in these feature sets, the upper body is considered to range from the
head to the lower abdomen (right above the keypoints of the hips).
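The following NumPy-based sketch illustrates Eqs. 2 and 3; the paper does not publish its implementation, so the function names and the use of arctan2 (instead of a plain arctan, to keep quadrant information and avoid division by zero) are our choices:

```python
# Sketch of the angle (Eq. 2) and distance (Eq. 3) features computed from
# 2D joint coordinates (e.g., OpenPose keypoints). Illustrative only.
import numpy as np

def line_line_angle(j1: np.ndarray, j2: np.ndarray) -> float:
    """Angle between the line through joints j1 and j2 and the horizontal
    axis of the camera viewpoint (cf. Eq. 2), in radians."""
    return float(np.arctan2(j1[0] - j2[0], j1[1] - j2[1]))

def joint_distance(j1: np.ndarray, j2: np.ndarray) -> float:
    """Euclidean distance between two joints (cf. Eq. 3)."""
    return float(np.linalg.norm(j1 - j2))

shoulder = np.array([220.0, 140.0])   # illustrative pixel coordinates
elbow = np.array([250.0, 190.0])
print(line_line_angle(shoulder, elbow), joint_distance(shoulder, elbow))
```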
At first, single frames were used to recognize actions. However, some med-
ical actions are quite similar when comparing individual frames rather than a
segment of a video, because they are performed on the same region of interest.
For instance, during palpation of the abdomen, both hands are pressed on the
abdomen, whereas during percussion one hand stays on the abdomen while the other
is repeatedly released. Therefore, percussion is easily confused with palpation. The
same is true for auscultation of the lungs and heart. In case of the heart, only
the area of the chest around the heart is covered, while in case of the lungs the
entire chest is examined. The two are more difficult to distinguish when only
individual frames are considered. When taking into account frame segments of
videos as opposed to single frames, the accuracy of the classifier increases by
0.059 to 0.756, using segments of 120 frames and a sliding step of 20 frames.
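A sliding-window segmentation matching these parameters can be sketched as follows; aggregating per-frame predictions by majority vote is an assumed strategy, as the paper does not detail the aggregation:

```python
# Sketch of segment-based classification with windows of 120 frames and a
# sliding step of 20 frames, as in the reported experiment. Majority
# voting over per-frame predictions is an assumed aggregation strategy.
import numpy as np

def sliding_windows(n_frames: int, window: int = 120, step: int = 20):
    """Yield (start, end) index pairs of overlapping frame segments."""
    for start in range(0, n_frames - window + 1, step):
        yield start, start + window

def segment_labels(frame_predictions: np.ndarray, window: int = 120,
                   step: int = 20) -> list:
    """Assign each segment the majority label of its frames."""
    labels = []
    for start, end in sliding_windows(len(frame_predictions), window, step):
        values, counts = np.unique(frame_predictions[start:end],
                                   return_counts=True)
        labels.append(values[np.argmax(counts)])
    return labels
```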
The use of segments increased the accuracy of the classifier when distinguish-
ing palpation and percussion of the abdomen and auscultation of the lungs and
heart. Confusion matrices illustrating the prediction accuracy of the best
performing feature set of the classifier, using individual frames and segments,
are shown in Figs. 5 and 6, respectively. While the prediction accuracy does not
increase for all individual actions, the average prediction accuracy does.
Fig. 5. Confusion matrix of feature set with best test accuracy (sets 3–5) of RF classifier.
Fig. 6. Confusion matrix of best performing segment (120, 20) for RF (feature sets 3–5).
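For reference, confusion matrices such as these can be computed directly from true and predicted labels; a minimal sketch, assuming scikit-learn and placeholder label lists:

```python
# Minimal sketch of producing a confusion matrix like those in Figs. 5
# and 6. scikit-learn and the label arrays are assumed placeholders.
from sklearn.metrics import confusion_matrix

y_true = ["BPM", "PaA", "PeA", "PaA", "AL", "AH"]
y_pred = ["BPM", "PaA", "PaA", "PaA", "AL", "AL"]
labels = ["BPM", "PaA", "PeA", "AL", "AH", "AA"]
print(confusion_matrix(y_true, y_pred, labels=labels))
```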
6 Discussion
Validity Threats. Firstly, the DARC-method was only applied to a single case
and a single domain; it may need customization when applied to other domains.
Secondly, the method was applied by people with some experience with both
ML and classifiers and the specific domain it was used for. Additionally, we have
not yet been able to conduct the deployment phase, meaning this part was never
tested in a real-world situation and cannot be described in detail. Given that
there are additional steps to developing the classifier, there is a risk of introducing
errors throughout the process [24]. Finally, others less familiar with the subject
matter and process may have a more difficult time applying the method to their
own case and/or domain. However, the method is based on existing and validated
techniques, which should improve its external validity.
References
1. Cao, Z., Hidalgo, G., Simon, T., Wei, S.E., Sheikh, Y.: OpenPose: real-
time multi-person 2D pose estimation using part affinity fields. arXiv preprint
arXiv:1812.08008 (2018)
2. Colleoni, E., Moccia, S., Du, X., De Momi, E., Stoyanov, D.: Deep learning based
robotic tool detection and articulation estimation with spatio-temporal layers.
IEEE Robot. Autom. Lett. 4(3), 2714–2721 (2019)
3. Cunningham, P.: Dimension reduction. In: Cord, M., Cunningham, P. (eds.)
Machine Learning Techniques for Multimedia. COGTECH, pp. 91–112. Springer,
Heidelberg (2008). https://doi.org/10.1007/978-3-540-75171-7 4
4. Derpanis, K.G., Sizintsev, M., Cannons, K., Wildes, R.P.: Efficient action spotting
based on a spacetime oriented structure representation. In: Proceedings of the
CVPR, pp. 1990–1997. IEEE (2010)
5. Dietterich, T.: Overfitting and undercomputing in machine learning. ACM Com-
put. Surv. (CSUR) 27(3), 326–327 (1995)
6. Donoho, D.: 50 years of data science. J. Comput. Graph. Stat. 26(4), 745–766
(2017)
7. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P.: The KDD process for extracting
useful knowledge from volumes of data. Commun. ACM 39(11), 27–34 (1996)
8. Gudivada, V., Apon, A., Ding, J.: Data quality considerations for big data and
machine learning: going beyond data cleaning and transformations. Int. J. Adv.
Softw. 10(1), 1–20 (2017)
9. Hellwig, B.: EUDICO Linguistic Annotator (ELAN), version 1.4 manual, last updated 2003
10. Herath, S., Harandi, M., Porikli, F.: Going deeper into action recognition: a survey.
Image Vis. Comput. 60, 4–21 (2017)
11. Jordan, M.I., Mitchell, T.M.: Machine learning: trends, perspectives, and
prospects. Science 349(6245), 255–260 (2015)
12. Kong, Y., Fu, Y.: Human action recognition and prediction: a survey. arXiv
preprint arXiv:1806.11230 (2018)
13. Kotsiantis, S.B., Zaharakis, I., Pintelas, P.: Supervised machine learning: a review
of classification techniques. Emerg. Artif. Intell. Appl. Comput. Eng. 160, 3–24
(2007)
14. Maas, L., et al.: The Care2Report system: automated medical reporting as an
integrated solution to reduce administrative burden in healthcare. In: Proceedings
of the 53rd HICSS (2020)
15. Mathias, M., Benenson, R., Pedersoli, M., Van Gool, L.: Face detection without
bells and whistles. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV
2014. LNCS, vol. 8692, pp. 720–735. Springer, Cham (2014). https://doi.org/10.
1007/978-3-319-10593-2 47
16. Moeslund, T.B., Hilton, A., Krüger, V.: A survey of advances in vision-based
human motion capture and analysis. Comput. Vis. Image Underst. 104(2–3), 90–
126 (2006)
17. Nath, T., Mathis, A., Chen, A.C., Patel, A., Bethge, M., Mathis, M.W.: Using
DeepLabCut for 3D markerless pose estimation across species and behaviors. Nat.
Protoc. 14(7), 2152–2176 (2019)
18. Noori, F.M., Wallace, B., Uddin, M.Z., Torresen, J.: A robust human activity
recognition approach using OpenPose, motion features, and deep recurrent neural
network. In: Felsberg, M., Forssén, P.-E., Sintorn, I.-M., Unger, J. (eds.) SCIA
2019. LNCS, vol. 11482, pp. 299–310. Springer, Cham (2019). https://doi.org/10.
1007/978-3-030-20205-7 25
19. Park, S., Trivedi, M.M.: Understanding human interactions with track and body
synergies (TBS) captured from multiple views. Comput. Vis. Image Underst.
111(1), 2–20 (2008)
20. Poppe, R.: A survey on vision-based human action recognition. Image Vis. Comput.
28(6), 976–990 (2010)
21. Ralyté, J., Deneckère, R., Rolland, C.: Towards a generic model for situational
method engineering. In: International Conference on Advanced Information Sys-
tems Engineering, pp. 95–110 (2003)
22. Schiphorst, L., Doyran, M., Salah, A.A., Molenaar, S., Brinkkemper, S.:
Video2report: a video database for automatic reporting of medical consultancy
sessions. In: 15th IEEE International Conference on Automatic Face and Gesture
Recognition, Buenos Aires (2020)
23. Sheng, V.S., Provost, F., Ipeirotis, P.G.: Get another label? Improving data quality
and data mining using multiple, noisy labelers. In: Proceedings of the 14th ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.
614–622 (2008)
24. Stergiou, A., Poppe, R.: Analyzing human-human interactions: a survey. Comput.
Vis. Image Underst. 188, 102799 (2019)
25. Varol, G., et al.: Learning from synthetic humans. In: Proceedings of the CVPR,
pp. 109–117 (2017)
26. van de Weerd, I., Brinkkemper, S.: Meta-modeling for situational analysis and
design methods. In: Handbook of Research on Modern Systems Analysis and
Design Technologies and Applications, pp. 35–54. IGI Global (2009)
27. van de Weerd, I., de Weerd, S., Brinkkemper, S.: Developing a reference method
for game production by method comparison. In: Ralyté, J., Brinkkemper, S.,
Henderson-Sellers, B. (eds.) Situational Method Engineering: Fundamentals and
Experiences. ITIFIP, vol. 244, pp. 313–327. Springer, Boston, MA (2007). https://
doi.org/10.1007/978-0-387-73947-2 24
28. Wirth, R., Hipp, J.: CRISP-DM: towards a standard process model for data mining.
In: Proceedings of the 4th International Conference on the Practical Applications
of Knowledge Discovery and Data Mining, pp. 29–39 (2000)
29. Yun, K., Honorio, J., Chattopadhyay, D., Berg, T.L., Samaras, D.: Two-person
interaction detection using body-pose features and multiple instance learning. In:
Proceedings of the CVPRW (2012)
30. Zhang, S., Liu, X., Xiao, J.: On geometric features for skeleton-based action recog-
nition using multilayer LSTM networks. In: Proceedings of the WACV, pp. 148–
157. IEEE (2017)
Evaluation-Related Research
(EMMSAD 2020)
An Evaluation of the Intuitiveness
of the PGA Modeling Language Notation
1 Introduction
Organizations need to align the business strategy with the internal infrastructure
and processes [7]. This enables companies to adequately react to opportunities and threats in their
external environment. The design of the PGA modeling method is the result of
different iterations of Action Design Research [17], which allowed the gradual
refinement of its syntax, semantics, and modeling procedure [15].
One of the design requirements of the PGA method is a clear communication
of the organizational strategy to ensure its understanding by business-oriented
experts [15]. These experts are not applying the method themselves, but are
guided by a modeler, who collects the necessary information and constructs the
PGA models. Therefore, ensuring that PGA models can be intuitively under-
stood by business-oriented end-users is of paramount importance to reduce the
cognitive load for them. This will foster the use of the models to identify possi-
ble organizational improvements. To realize this, the PGA notation was initially
guided by the principle of semantic transparency. This principle requires that
the graphical notation of a modeling language element suggests its meaning [10].
However, the intuitiveness of the PGA notation had not yet been tested.
This research gap can be addressed with an evaluation technique for testing the
intuitiveness of DSMLs [3]. The technique comprises a set of tasks which are
divided into three phases: (i) term association, (ii) notation association, and (iii)
case study. These tasks were conducted by Master's students of Ghent University
to test the intuitiveness of the PGA notation. Based on an analysis of the results,
improvements to the initial notation are proposed.
The paper is structured as follows. Section 2 reviews foundational literature
about conceptual modeling, modeling language notations, and the PGA model-
ing method. Afterwards, Sect. 3 describes how the evaluation technique and the
data analysis were performed. In Sect. 4, the results of the analysis are presented,
which leads to the proposal of an improved PGA notation in Sect. 5. The paper
ends with a reflection and concluding remarks in Sect. 6.
2 Foundations
Information processing can be divided into two steps [12]: Perceptual Processing
(seeing) which is fast and automatic, and Cognitive Processing (understanding)
which is slow and resource-intensive. Conceptual models should aim for computa-
tional offloading, i.e., replacing some cognitive tasks by perceptual ones. Moody
states that “Designing cognitively effective visual notations can [..] be seen as a
problem of optimizing them for processing by the human mind” [10, p. 761].
“The extent to which diagrams exploit perceptual processing largely explains
differences in their effectiveness” [10, p. 761]. When analyzing the perceptual
processing quality of a visual notation, one needs to consider semantic trans-
parency [10]. Semantic transparency is defined as “the extent to which a novice
reader can infer the meaning of a symbol from its appearance alone” [10, p. 765].
In literature, semantic transparency is often considered synonymous to an intu-
itive understanding. A notation with a high semantic transparency enables users
to infer the meaning of a symbol/model from their working and/or long-term
memory. Semantic transparency therefore “plays a crucial role in [...] accep-
tance” of modeling languages [5, p. 123].
PGA has been introduced in [14] and further developed in [15] as a project within
the Open Models Laboratory (OMiLAB) [2]. A tool prototype has been realized
with the ADOxx meta-modeling platform [1]. To achieve strategic fit in the busi-
ness architecture, PGA aims at the development of a business architecture heat
map following a modeling procedure that consists of three activities: (i) develop-
ing a prioritized business architecture hierarchy, (ii) executing the performance
measurement, and (iii) performing the strategic fit improvement analysis.
The first step aims to model the creation of value throughout a hierarchi-
cal structure of business architecture elements. Based on Strategic Management
frameworks, the PGA meta-model incorporates the following elements (i.e., cap-
italized in the remainder of the text): Activity, Process, Competence, Value
Proposition, Financial Structure, Internal Goal, Customer Goal, and Financial
Goal. To design an intuitive notation for business-oriented end-users, icons were
used to represent these elements. An overview of the initial PGA notation is
found in Table 4. Afterwards, valueStream relations are added between these
elements to show the hierarchical value structure. Each valueStream relation
is prioritized by using the AHP mechanism (i.e., based on pairwise compar-
isons) [16] and a color coding with accompanying line texture is used to differ-
entiate between a high (i.e., solid red color), medium (i.e., dashed orange color),
or low priority (i.e., dotted green color) w.r.t. their strategic Importance.
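Since the AHP mechanism [16] underlies this prioritization, a brief sketch may help: priorities can be derived from a reciprocal pairwise comparison matrix, here approximated via row geometric means. The 3×3 matrix is illustrative only; the concrete comparisons are case-specific:

```python
# Sketch of the AHP prioritization mechanism [16]: deriving priority
# weights from a reciprocal pairwise comparison matrix via the row
# geometric mean approximation. The example matrix is illustrative.
import numpy as np

def ahp_priorities(pairwise: np.ndarray) -> np.ndarray:
    """Normalized priority vector from a pairwise comparison matrix."""
    geo_means = pairwise.prod(axis=1) ** (1.0 / pairwise.shape[1])
    return geo_means / geo_means.sum()

comparisons = np.array([
    [1.0,   3.0,   5.0],
    [1/3.0, 1.0,   2.0],
    [1/5.0, 1/2.0, 1.0],
])
print(ahp_priorities(comparisons))  # weights mapped to high/medium/low priority
```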
The performance measurement mechanism is applied to each business archi-
tecture element to identify an appropriate performance indicator, set a perfor-
mance target and an allowed deviation percentage, and to analyze the actual
outcome for each indicator. This enables the differentiation between an excellent,
expected, or bad Performance for each element. Following existing heat map
approaches, this Performance is visualized through color coding in the resulting
business architecture heat map (see Fig. 1).
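As an illustration of this measurement mechanism, the classification of an outcome can be sketched as follows; the exact threshold semantics are an assumption on our part, not the official PGA definition:

```python
# Illustrative reading of the PGA performance measurement: an outcome at
# or above target counts as excellent, an outcome within the allowed
# deviation below the target as expected, and anything lower as bad.
# These threshold semantics are an assumption, not the official definition.
def performance(outcome: float, target: float, allowed_deviation_pct: float) -> str:
    if outcome >= target:
        return "excellent"
    if outcome >= target * (1.0 - allowed_deviation_pct / 100.0):
        return "expected"
    return "bad"

print(performance(outcome=92.0, target=100.0, allowed_deviation_pct=10.0))  # expected
```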
Fig. 1. Example of a business architecture heat map in PGA [14] (Color figure online)
1 The labels were manually adapted to improve readability on a limited space.
3 Methodology
3.1 Evaluation Technique
In this paper, we applied the evaluation technique of [3] to test the intuitiveness
of the PGA notation with the intended end-users. The participants were given
60 min to complete the evaluation questions. This evaluation comprised a set of
tasks, which were clustered in three core phases surrounded by an initiation and
a conclusion phase (see Fig. 2).
The PGA concepts were split between the two association tasks, which made it
possible to divide the participants into two groups. Group A had half of the concepts
as part of the term association and the other half as part of the notation asso-
ciation. For group B, the order of the phases was the same, but the concepts
were divided oppositely between the two tasks.
Phase 3 – Case Study. This task included comprehension questions targeting
an example of a business architecture heat map (see Fig. 1). The legend is
added here for clarification but was not provided to the respondents. Each
question was oriented towards the identification of particular meta-model
elements in the model, e.g., how many elements have a good performance,
which type of element is supported by Operational excellence, etc.
Concluding Phase. As a last step, participants were asked to provide qualita-
tive feedback and improvement suggestions about the current PGA notation.
All participant responses were digitized and stored in a shared cloud infras-
tructure. All authors started with a pretest, analyzing the results of only a few
responses. Afterwards, the gained experience was exchanged to streamline the
structure of the analysis, e.g., the visual variables to be applied during the clas-
sification of the term associations. Next, the authors independently analyzed all
responses, after which the analysis was condensed toward a harmonized result.
4 Evaluation Results
4.1 Participants
The participants were students following a Master level class on IT Management
at the Faculty of Economics and Business Administration of Ghent University.
In total, 139 students participated in the user study. The participants were ran-
domly assigned to two different groups (see Sect. 3.1), resulting in 70 participants
for group A and 69 for group B. Their average age was 22 years and 41% of them
were female. Although the participants were not familiar with the PGA method,
86% had some prior modeling knowledge about ER modeling, 90% about busi-
ness process modeling, and 34% about ArchiMate.
Table 2 shows the results of the notation association task. For each element, the
percentage of participants giving a matching association and the relative rank of
this association is listed. Important to note here is that the visualization of the
valueStream relation (i.e., a non-directed line, see Table 4) was not explicitly
tested as the meaning of this relation only becomes clear when included in a
hierarchical business architecture heat map.
The percentage of correct associations ranges between 0% and 36.23%. The
PGA concepts Activity (24.29% - rank 1), Process (36.23% - rank 1), Financial
Structure (12.75% - rank 3), and Financial Goal (20.29% - rank 2) perform the
best as we analyze both the percentage and the relative rank of the correct asso-
ciations. The notation of the other elements is less intuitive, as the percentages
are below 5%. Moreover, some of them are outperformed by other meta-model
elements. More specifically, the Competence notation (i.e., a stage icon) is confused
with Performance by 52.17% of the participants, and the icon of Internal Goal
(i.e., a cogwheel) is associated with a Process by 35.29% of the participants.
In Table 3, the results of the case study are given. To keep this example man-
ageable for participants in the given time, only one type of Goal (i.e., Financial
Goal) was included in the sample model. This is the reason why no results
are available for Customer and Internal Goal in Table 3. Although all questions
were oriented towards the identification of meta-model elements, partially cor-
rect answers could also be identified. These include naming elements at the
instance level (e.g., Take sample instead of Activity) or using close synonyms for
the meta-model element (i.e., Task as a synonym for Activity). Besides, there
was not a question that directly targeted the identification of a valueStream,
but problems with the intuitiveness of this relation can be derived from incor-
rect answers to the questions about the Activity and Value Proposition concept.
More specifically, some incorrect answers indicate that the valueStream relation
was interpreted in the wrong direction.
Although the mean score of completely correct answers for this task is 41.32%,
Table 3 shows that the meaning of the Value Proposition (i.e., 5.04% correct
answers) and Importance (i.e., 5.76% correct answers) notations cannot easily be
derived from the business architecture heat map. Even if partially correct answers
are included, these elements are the two worst performing of all PGA concepts,
with total scores of 21.59% for Value Proposition and 14.39% for Importance.
Besides, there seems to be a problem with the intuitiveness of the valueStream
notation, which was read in the wrong direction in the Activity and Value
Proposition questions by 18.71% and 27.34% of the participants, respectively. As
one can notice, the scores for Financial Structure and Financial Goal are the same.
During the conclusion phase, we obtained 104 remarks from 58 unique partici-
pants (i.e., a response rate of 41.73%). Of the responses, 45 could be specifically
traced back to the PGA meta-model, distributed among the aspects color and
line style (24 remarks), Importance (12 remarks), valueStream (5 remarks), and
Activity (4 remarks). As can be seen in Table 4, color and line style refer both to
Performance and Importance in the PGA meta-model. We provide illustrative
feedback in the following.
– Color & line style: “Using colors is a good idea, it gives a nice and quick
overview.”
“The meaning of the different colors & line styles is not clear.”
– Importance: “It is not clear what the numbers next to the relations mean.”
– valueStream: “It is difficult to see where certain value streams go to.”
– Activity: “The model would improve if the total process of how the organi-
zation operates was represented.”
We conclude the design cycle by proposing directions for improving the PGA
notation. This proposal is based on the combined evaluation results discussed
previously. In particular, we distinguish between (i) no change is required and
(ii) the suggestion of a new notation. In the first case, no change is required
as the results confirm the intuitiveness of the initial notation. In the latter case, we
use (some of) the suggestions of the participants to propose a new notation. For
some elements, this also includes changes aimed at the homogenization of the
notations of all PGA elements.
For the PGA elements Activity, Process, Financial Structure, Financial Goal,
and Performance, we propose to preserve the initial notation based on the anal-
ysis of the results. The notation association, case study, and qualitative feedback
confirm the intuitiveness of these elements. Moreover, the suggested notations
of the term association phase only include generic shapes (i.e., rectangle and
ellipse) with no or recurring icons (i.e., dollar/euro sign). Following these sug-
gestions would have a negative impact on the perceptual discriminability between
Activity and Process on the one hand, and Financial Structure and Financial
Goal on the other hand. Besides, we understand the qualitative feedback about
the lack of a complete process description in the PGA models. However, this
is a deliberate design choice of the modeling method as the main purpose of
the business architecture heat maps is to achieve alignment between the differ-
ent layers. Therefore, it is not always necessary to offer a complete view of the
business architecture, as this may hamper the understanding of the models [15].
Performance is an exception in the analysis, as it combines a low score for the
notation association (i.e., 0%) with a score of 81.29% for the case study. This
can be explained by the fact that Performance is implemented as an attribute
to the other PGA meta-model elements. Consequently, the meaning of the color
coding only becomes intuitive when implemented in a complete business archi-
tecture heat map (see Fig. 1). Qualitative feedback further confirmed that the
use of color gives a nice and quick overview of alignment opportunities
in the business architecture.
The main argument to propose a new notation for Competence is the con-
fusion that the initial one causes for end-users. Indeed, during the notation
association phase, it became clear that people naturally attach the meaning of
Performance to the visualization. Based on the suggestions of the participants
during the term association task, we propose a combination of a person and light
bulb icon as the new notation (see Table 4). This notation should refer to the
cognitive abilities that are associated with the definition of a Competence as the
internal knowledge, skills and abilities of an organization.
A new notation for Value Proposition is also proposed in Table 4, as the ini-
tial notation was one of the least performing PGA elements during the notation
association (i.e., 2.83%) and case study (i.e., 5.04%) tasks. However, the icons
suggested by participants do not show a clear preference, as they are closely
related to financial elements (i.e., dollar/euro or + sign) or cognitive abilities
(i.e., light bulb). Therefore, the new notation shows a gift exchanged between
two hands. We believe this provides a more intuitive notation for the products
and services that are exchanged between a company and its customers. This pro-
posal is in line with the notation of a Value Proposition in the Business Model
Canvas (i.e., a gift icon) [13].
Table 4 (excerpt). Activity: blue (rounded) rectangle – no change is required.
This paper describes the execution of an evaluation technique [3] to test the
intuitiveness of the initial PGA notation [15]. This evaluation was needed to
validate the communication potential of PGA and to improve the understand-
ing and acceptance of the resulting models by business-oriented end-users. The
evaluation tasks were performed by 139 Master’s students of Ghent University
with an extensive economics background and basic modeling experience. The
analysis of these tasks and the qualitative feedback led to the proposal of an
alternative notation for six of the 11 elements of the PGA modeling method.
This research is not free from threats to validity [18]. To preserve construct
validity, it is important to ensure that the executed tasks are suited to eval-
uate the intuitiveness of a DSML. Therefore, we applied an existing evaluation
technique, for which the origin of the tasks is rigorously substantiated [3]. With
respect to internal validity, external factors that influence the results need
to be avoided. In this respect, participants were chosen with the same educa-
tional background (i.e., Master’s students in Business Engineering and Business
Administration). Besides this, the participants had similar foreknowledge in con-
ceptual modeling and received a collective introduction to PGA. Furthermore,
participation was voluntary and no compensation was provided. Finally, we
used two different randomly assigned groups and divided the PGA concepts
between the term association and notation association tasks to mitigate an allo-
cation bias. In this way, we made sure that the terms given during the first task
did not influence the associations of the notation association phase. The choice of
participants also affects the external validity or generalizability of the results.
The students have a strong economic orientation which enabled us to obtain
a group of respondents with knowledge and skills that can act as a proxy for
business-oriented stakeholders. These stakeholders are the targeted end-users of
the PGA modeling method. Nevertheless, the choice for students is an inherent
limitation and further research is needed to replicate the evaluation technique
with business practitioners. Reliability reflects the degree to which the results
could be reproduced by the modeling community. To ensure this, the procedure
that was used to apply the evaluation technique and the URL of the evaluation
questionnaires can be found in Sect. 3.1. Finally, we added the details about the
analysis of the different evaluation tasks in Sect. 3.2.
Future research is needed for the evaluation of the proposed improvements.
This includes an experiment, in which the intuitiveness of the initial and newly
proposed notation is compared. Such an experiment could be based on recall
and comprehension questions, which compare the effectiveness and efficiency of
interpreting both versions of the PGA notation [4]. Nevertheless, more research is
needed to set up a rigorous experimental design. In this respect, we are currently
implementing the new version of the notation to become part of a future version
of the PGA modeling tool. The new tool shall be made available through the
PGA project space within the Open Models Laboratory (OMiLAB) [2].3
On a separate research stream, we will investigate possibilities of automat-
ing the applied evaluation technique [3]. In this respect, we aim to set up a
web-environment that automatically generates the evaluation sheets once the
concepts and sample notations are uploaded. Moreover, it shall provide a WYSI-
WYG web editor for drawing notations and storing them. To support the anal-
ysis of the collected data, this system shall use OpenCV or similar technologies
to automatically analyze the created proposals for new notations. Besides this,
3 PGA project space within OMiLAB [online], https://austria.omilab.org/psm/content/PGA/info, last accessed: 04.03.2020.
enabling text analysis could be useful for the results of the notation association
task, as could statistical analysis of the responses and the automated generation
of evaluation reports. Ultimately, the web-environment
will increase the possibilities of testing a modeling language comprehensively,
as it enables an efficient set-up, execution, and analysis of the evaluation. Con-
sequently, it will mitigate issues related to the paper-and-pen evaluation of a
tool-based modeling language.
References
1. ADOxx.org: ADOxx Metamodelling Platform (2020). https://www.adoxx.org/
live/home. Accessed 15 Jan 2020
2. Bork, D., Buchmann, R.A., Karagiannis, D., Lee, M., Miron, E.T.: An open plat-
form for modeling method conceptualization: the OMiLAB digital ecosystem. Com-
mun. Assoc. Inf. Syst. 44, 673–697 (2019)
3. Bork, D., Schrüffer, C., Karagiannis, D.: Intuitive understanding of domain-specific
modeling languages: proposition and application of an evaluation technique. In:
Laender, A.H.F., Pernici, B., Lim, E.-P., de Oliveira, J.P.M. (eds.) ER 2019. LNCS,
vol. 11788, pp. 311–319. Springer, Cham (2019). https://doi.org/10.1007/978-3-
030-33223-5 26
4. Burton-Jones, A., Wand, Y., Weber, R.: Guidelines for empirical evaluations of
conceptual modeling grammars. J. Assoc. Inf. Syst. 10(6), 495–532 (2009)
5. El Kouhen, A., Gherbi, A., Dumoulin, C., Khendek, F.: On the semantic trans-
parency of visual notations: experiments with UML. In: Fischer, J., Scheidgen,
M., Schieferdecker, I., Reed, R. (eds.) SDL 2015. LNCS, vol. 9369, pp. 122–137.
Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24912-4 10
6. Frank, U.: Domain-specific modeling languages: requirements analysis and design
guidelines. In: Reinhartz-Berger, I., Sturm, A., Clark, T., Cohen, S., Bettin, J.
(eds.) Domain Engineering, pp. 133–157. Springer, Heidelberg (2013). https://doi.
org/10.1007/978-3-642-36654-3 6
7. Henderson, J., Venkatraman, N.: Strategic alignment: leveraging information tech-
nology for transforming organizations. IBM Syst. J. 38(2–3), 472–484 (1999)
8. Karagiannis, D., Kühn, H.: Metamodelling platforms. In: Bauknecht, K., Tjoa,
A.M., Quirchmayr, G. (eds.) EC-Web 2002. LNCS, vol. 2455, pp. 182–182.
Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45705-4 19
9. Karagiannis, D., Mayr, H.C., Mylopoulos, J.: Domain-Specific Conceptual Mod-
eling - Concepts, Methods and Tools. Springer, Cham (2016). https://doi.org/10.
1007/978-3-319-39417-6
10. Moody, D.: The “physics” of notations: toward a scientific basis for constructing
visual notations in software engineering. IEEE Trans. Softw. Eng. 35(6), 756–779
(2009)
11. Mylopoulos, J.: Conceptual modelling and Telos. In: Conceptual Modelling,
Databases, and CASE: An Integrated View of Information System Development,
pp. 49–68. Wiley, New York (1992)
12. Newell, A., Simon, H.A.: Human Problem Solving, vol. 104. Prentice-Hall, Engle-
wood Cliffs (1972)
13. Osterwalder, A., Pigneur, Y., Tucci, C.: Business Model Generation: A Handbook
for Visionaries, Game Changers, and Challengers. Wiley, Hoboken (2010)
14. Roelens, B., Poels, G.: The creation of business architecture heat maps to support
strategy-aligned organizational decisions. In: 8th European Conference on IS Man-
agement and Evaluation (ECIME), pp. 388–392. Acad. Conferences Ltd. (2014)
15. Roelens, B., Steenacker, W., Poels, G.: Realizing strategic fit within the business
architecture: the design of a process-goal alignment modeling and analysis tech-
nique. Softw. Syst. Model. 18(1), 631–662 (2019)
16. Saaty, T.: How to make a decision: the analytic hierarchy process. Eur. J. Oper.
Res. 48(1), 9–26 (1990)
17. Sein, M., Henfridsson, O., Purao, S., Rossi, M., Lindgren, R.: Action design
research. MIS Q. 35(1), 37–56 (2011)
18. Wohlin, C., Runeson, P., Höst, M., Ohlsson, M.C., Regnell, B., Wesslén, A.: Exper-
imentation in Software Engineering. Springer, Heidelberg (2012). https://doi.org/
10.1007/978-3-642-29044-2
Does Enterprise Architecture Support Customer
Experience Improvement? Towards
a Conceptualization in Digital Transformation
Context
1 Introduction
Digital transformation (DT) is proliferating in organizations around the world [1b].
Increasingly demanding customers are pushing digital competition to its edges [8]. Many
organizations that market products and services involving direct interactions with
customers invest considerably in digital transformation [1b, 2b]. One of the
main reasons for these investments is the ability of DT to improve customer experience
[3b, 4b, 8, 9b]. Banking, for example, is among the first industries to engage in DT projects
[36]. Changing consumer habits and the new competitive environment are forcing banks
to urgently deal with their customer processes so as not to be left behind in a rapidly
fluctuating market. Customers expect financial services to be available 24 h a day, seven
days a week, and as user-friendly as social networks or messaging solutions that they
use every day [4b, 5b, 6b].
Thus, customer experience is seen as the new competitive field of marketing. The
consulting firm Gartner pointed out that 57% of customers stopped buying from a com-
pany because a competitor offered a better experience [37], and that 67% of customers
are willing to pay more for a better customer experience. However, digital projects
aimed at improving customer experience often stall at the starting point [36, 37]. This
is due to the complexity of implementing this transformation; indeed, it is a complex
process impacting several areas and components of the organization [4b, 8b]. It involves
managing the volatile behavior of customers, understanding their complex data [7b,
16b, 17b, 18b, 19b], carrying out numerous optimizations of customer processes [14b],
transforming business models [6b, 13b, 14b], integrating various digital technologies
[6b, 7b, 8b] and adapting to changing business conditions [8b, 12b, 14b].
This implementation is even more complex since organizations do not start from a
blank page to design their customer experiences. Many have already established cus-
tomers, processes, and assets that require reorientation to carry out strategies specific to
the customer experience [5b]. Drawing on our expertise in digital transformation
consulting, we observe that managers suffer from the absence of tools that help define
a reachable target while considering the existing environment. Consultants often tend
to start their analyses from a blank sheet and design ideal transformative customer
experiences almost independently of the real context in which the company operates.
Among the techniques that appeared in recent years to support such transforma-
tions, Enterprise Architecture (EA) and EA Management (EAM) seem to be essential
[1, 4, 5]. While EA describes the fundamental structures of an organization, EAM is
believed to support transformation management by guiding the necessary coordination
efforts [3] and providing information for strategy development [4, 5, 7]. It also provides
Enterprise Models (EM) to various stakeholders in transformation projects and enhances
communication by establishing shared and mutual understandings [5]. Likewise, EAM
can guide decision processes and contribute to better design choices that align with the
operational and strategic goals of the transformation endeavor [6, 10, 11, 16, 17].
Nevertheless, and according to our consulting experience, EAM is not commonly
applied. It is rarely perceived as a support service for digital projects, especially projects
concerning customer experience improvement. There is a tendency to consider EAM as a
ence Improvement (CEI) projects are, however, more profound and broader than an IT
transformation and could impact commercial processes and business models [8, 9, 5b,
10b].
We tend to consider that there is a severe gap between the information offered by
EAM and the managers’ demands in digital transformation projects. Architects seem
not to know how to support CEI project managers, and these managers are not aware
of how EAM might support their effort [23]. For this paper, we try to provide the first
step towards a better understanding of EAM support for CEI in a digital transformation
context. This leads to the following research questions:
• RQ1: What are the necessary information needs of CEI projects in a digital transformation context?
• RQ2: What are the content elements that Enterprise Models can provide to cover these
necessary information needs?
2 Related Work
Many research studies have stated that EAM can address partial problems within DT
from a management point of view. In [15], the authors consider EAM as a governing
tool that helps mastering the alignment of portfolios of transformation steps. They also
claim potential capacities in different fields, such as strategic direction, gap analysis,
strategic planning, and operational planning. In [17], the focus is on the strategic change
process and how EAM can support it. The author argues that EAM can presumably
support the strategic fit with the market environment and business-IT alignment.
Moreover, EAM can help in preparing the change by standardizing and modularizing
parts of the enterprise.
Over the years, studies have associated several benefits with EAM. These are gen-
erally indirect, large-scale, and perceived over a long period, which makes it
difficult to calculate an exact return on investment (ROI) [35]. However, in the occasional
cases where the ROI has been calculated, the results seem remarkable [35]. Among
these benefits, we highlight: increasing flexibility, integration and interoperability [7,
15, 17, 19–22]; better alignment of IT with business [1, 3, 10, 11, 15, 17, 19, 22]; IT
costs reduction [1, 3, 15, 17]; improved risk management, situational awareness and
decision-making [3, 9, 16, 21]; better results from strategic business initiatives [10, 11,
21, 38].
Other recent studies (e.g. [21, 22]) examine the evolution of modeling languages and
techniques to make them better adapted to the new age of digital transformation. They
assume that during enterprise transformations, companies need shared understanding
and agreement on topics such as the overall strategy of the enterprise, the existing
processes, as well as the future vision of the top management. However, when enterprise
modeling languages were developed, the digital transformation challenges were not yet
that noticeable. At that time, the focus was more on consolidation and optimization [38,
39]. As such, it is logical to expect that the existing languages may require some updates
concerning new element content to be truly ready for modeling the digital transformation
impacts on the organization ecosystem [23].
To conclude, many studies focus on how EAM can support transformation manage-
ment from an EAM point of view (e.g. [10, 15–18]). However, the demand perspective of
CEI projects in a digital transformation context is not available in the current discussion.
Thus, we will investigate which information inputs the demand side needs and whether
current EAM can provide them.
3 Research Approach
To answer our research questions, we proceeded in three steps (Fig. 1):
Research Identification
For this paper, we investigated the research questions described in the introduction.
Search Strategy
We developed the terms related to the research questions; the aim was to identify
synonyms for these terms by conducting several tests. We used the Boolean operators
(OR, AND) to connect the identified terms. We used the following string for automated
search: (“digital transformation” OR “digitalization”) AND (“customer experience” OR
“consumer experience” OR “client experience”). We conducted the search for articles
using the Scopus database. The search started on January 25, 2020.
Study Selection
We included papers that respect the following criteria: a) written in English; b)
published in a scientific journal; c) dealing with digital transformation. Documents
that were not accessible were excluded, as were master's and doctoral theses,
proceedings or conference articles, working papers, and textbooks. This choice of
journal articles follows the position of [26], who claims that “academics and
practitioners alike use journals most often for acquiring information and disseminating
new findings and represent the highest level of research”.
Quality Assessment
Based on the works of [27–29], we assessed the rigor and relevance of the selected
articles. We used criteria such as a clear description of the context in which the
research was carried out, a precise statement of research aims, a high level of rigor
in conducting the data collection and analysis, the relevance of the findings, and the
extent to which the study is valuable for research or practice. The assessment was
conducted by both authors and each paper was given a quality score. At the end of this
process, we had qualified 19 articles for the data extraction step. These articles are
numbered 1b, 2b, etc. and appear separately in an online appendix.1
Data Extraction
We extracted data from the qualified articles and categorized it according to the model
proposed by [8]. The authors of this work have defined three blocks of digital transfor-
mation impact on the customer experience (Customer understanding, Top Line Growth,
and Customer Touch point). Thus, from each paper, we extracted the data requirements
that were considered necessary to carry out these transformations and categorized them
according to the three proposed blocks.
3.2 Step 2: Defining What Are the Content Elements that EAM Can Provide
by Analyzing ArchiMate Meta-Model
In a second step, we analyze and then conceptualize the information inputs that EAM
can provide to the CEI projects. We relied for our work on the content meta-model
of ArchiMate 3.0. ArchiMate is a mature industry standard that, on the one hand, is
maintained by companies and research partners; on the other hand, it is often used as a
foundation for many corporate EAM frameworks [29]. ArchiMate provides a conceptual
macro overview of the information that EAM can provide and thus allows for a more
generic discussion. Again, we ensured reliability and validity by comparing the identified
content elements with other meta-models like TOGAF [2], GERAM [30], Zachman [31],
DODAF [32] and IEEE [33].
3.3 Step 3: Mapping Results of Steps 1 and 2 Using Focus Group Technique
After identifying the needed information inputs of CEI projects and the available
information outputs of EAM, we mapped both in a third step. A major challenge was
the different languages apparent in both disciplines, which inhibited a straightforward
one-to-one mapping. Hence, at the start, we based our first mapping test on the meta-model
specification and additional literature. Then we conducted a Focus Group [34], where
we presented our pre-filled mapping to six enterprise architects from a French bank who
use and master ArchiMate as a meta-model in their daily modeling activities (Table 1).
We collected feedback on our initial mapping by explaining our choices based on
literature. Then, the architects analyzed, for each CEI information need, the content
that the ArchiMate meta-model could provide in terms of concepts, based on concrete
1 Available online at ResearchGate: https://doi.org/10.13140/RG.2.2.33596.18560.
examples of their modeling activities. The focus group process took almost two weeks.
Three meetings (2 to 3 h each) were held to introduce and work on the mapping.
The process can be summarized as follows:
First, we discussed the initial results of our first literature-based mapping. One of the
co-authors led this stage and shared ideas from the literature concerning ArchiMate
modeling. This first meeting was an open discussion of the advantages and limitations
of ArchiMate in the banking context.
Second, the architects carried out the mapping by relying on transformation projects
related to CEI. We asked each architect to define a mapping independently; we then
discussed the outputs to agree on a common mapping, which was adopted by all
participants at the end of this meeting. The final mapping was accomplished through a
collegial reconciliation of the individual results, which did not differ much to begin with.
In the last step, the final mapping was presented by one of the co-authors to reflect
collegially on the limits of ArchiMate modeling. This is formulated in the “Discussion”
section of this paper.
4 Results
Based on the work of [8] and our data analysis, we identified four major groups of
how digitalization transforms customer experience, together with the information
needed for these transformations:
across customer experience and operational processes; for example, many retail-
ers now offer home shopping with the option to receive products by mail or in a
store (Table 4).
D) Integrating Digital Capabilities: Digital capabilities are fundamental components
for transforming customer experience. While top management and existing IT
departments are leading digital initiatives across companies, they hire new digital
skills around Big Data, real-time communication, etc. During the ‘Data Analysis’
phase, we introduced this group to classify all the information input that the com-
pany needs to know about the digital capabilities it has, and how to integrate and
reuse them for other transformation needs (Table 5).
In this part, we illustrate which information EAM can provide by following the basic
structure of the ArchiMate 3.0 content meta-model [29]. This meta-model contains
general elements that are connected in a one-to-one manner. The other elements are
differentiated into business, data, application, and technology architecture.
During the mapping, it became apparent that some of the information CEI projects
need can be (almost) fully provided by EAM, and some almost not. We rated this on a
five-point scale ranging from one (“CEI needs almost not supported by EAM”) to five
(“full support”). In Figs. 3, 4, 5, and 6, we provide the mapping results.
5 Discussion
The findings show that, from a modeling point of view, EAM has the potential to support
CEI projects. Our results further show that there are some information elements that EAM
can easily deliver since the relevant information source exists explicitly and is maintained
frequently (e.g., process, goals, or roles). Other information inputs require more analysis
and interpretation by the architects to be a valuable input to the requesting CEI projects
(e.g., digital strategy, business model). The CEI-required information elements that
are well supported by EAM have some common characteristics:
A) They do not focus on individuals but cover an overall perspective (e.g., goals, struc-
tures of the enterprise). Activities that take a social and a narrower focus would
be better documented by other disciplines like human-focused management or
psychology (e.g., customer ideologies, trends, etc.).
B) The information has a strong focus on the internal perspective of the enterprise; it
concerns the organizational processes, structures, etc. Thus, data that needs to be
collected outside the company, like context, business networks, market trends, or
customer satisfaction and sentiment, is not included in current EAM practice. Such
external information is particularly hard for EAM to collect (because of limitations
in the meta-model) and thus should instead be piloted by other disciplines, like
marketing departments, or by special projects that sense for such needed information.
We also claim that EAM does not offer enough elements to describe the context of
customers and their feedback, because organizations are not used to putting these
at the heart of project design. With the emergence of collaborative and agile
innovation methods in companies, EAM must adapt its meta-models to consider
customer trends and feedback before the completion of projects.
C) EAM mostly supports digital projects that are based on explicit and formal require-
ments. Inputs that are related to society, trends (socially informed knowledge, market
information), or predictive analysis are usually not supported. EAM also does not
address the confidentiality of customer information through modeling.
6 Conclusion
In this paper, we discussed how EAM could support CEI projects in terms of modeling
using ArchiMate. We contributed first with a detailed literature survey to identify the dig-
italization impact on customer experience. Our systematic literature review identifies four major
groups of how digitalization transforms customer experience: Understanding Customer,
Enabling selling activities, Managing Customer Touch Points, and Integrating digital
technologies. Then, we have defined the information inputs required for these transfor-
mations to understand what CEI is comprised of, and to provide a solid foundation for
further research in the customer experience and digital transformation area. The results
show that, in general, EAM is suited to support customer experience projects in a dig-
ital transformation context. Such transformations have a strong focus on the internal
perspective of the enterprise that is based on formal requirements (e.g., organizational
structure). Nevertheless, EAM lacks support when it comes to activities that require
inputs from the environment (e.g., trends, customer needs, customer satisfaction, etc.)
or society, trends, or predictive aspects.
This work has some limitations. First, our SLR is limited to a single database and we
chose to use only the journal articles that we had access to. Second, the ArchiMate meta-
model reflects the information that Enterprise Modeling can provide but does not integrate
specific potentials that EAM as an overall framework could additionally cover (e.g.,
architecture principles, best practices, etc.). We dealt with this limitation by conducting
several iterations during the Focus Group and by including additional EAM literature
during the mapping procedure. Third, we carried out the Focus Group with only banking
experts; nevertheless, we tend to believe that this work could be generalized in other
industrial fields because the customer experience has been impacted by digitalization in
the same way for all retailers (banks and others). We intend to ensure this in our future
work. Moreover, as future work, we intend to focus on the Enterprise Architecture
Support to other shapes of digital transformations.
References
1. Ross, J.W., Weill, P., Robertson, D.: Enterprise Architecture as Strategy: Creating a
Foundation for Business Execution. Harvard Business Press, Brighton (2006)
2. TOG, The Open Group: TOGAF Version 9.1. The Open Group, Berkshire, UK (2011)
3. Rouse, W.B.: A theory of enterprise transformation. Syst. Eng. 8(4), 279–295 (2005)
4. Tamm, T., Seddon, P.B., Shanks, G., Reynolds, P.: How does enterprise architecture add value
to organisations. Commun. AIS 28(1), 141–168 (2011)
5. Abraham, R., Aier, S., Labusch, N.: Enterprise architecture as a means for coordination – an
empirical study on actual and potential practice. In: The 7th Mediterranean Conference on
Information Systems, Paper 33. AIS Electronic Library (2012)
6. Asfaw, T., Bada, A., Allario, F.: Enablers and challenges in using enterprise architecture con-
cepts to drive transformation: perspectives from private organizations and federal government
agencies. J. Enterpr. Architect. 5(3), 18–28 (2009)
7. Greefhorst, D., Proper, E.: Architecture Principles – The Cornerstones of Enterprise
Architecture. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20279-7
8. Westerman, G., Calméjane, C., Bonnet, D., Ferraris, P., McAfee, A.: Digital Transformation:
A Roadmap for Billion-Dollar Organizations. MIT Center for Digital Business and Capgemini
Consulting (2011)
9. Pittl, B., Bork, D.: Modeling digital enterprise ecosystems with ArchiMate: a mobility pro-
vision case study. ICServ 2017. LNCS, vol. 10371, pp. 178–189. Springer, Cham (2017).
https://doi.org/10.1007/978-3-319-61240-9_17
10. Lankhorst, M.: Enterprise Architecture at Work: Modelling, Communication and Analysis,
2nd edn. Springer, Heidelberg (2009)
11. Winter, R., Townson, S., Labusch, N., Noack, J.: Enterprise architecture and transformation:
the differences and the synergy potential of enterprise architecture and business transformation
management. In: Uhl, A., Gollenia, L.A. (eds.) Business Transformation Essentials: Case
Studies and Articles, pp. 219–231. Routledge (2013)
12. Stolterman, E., Fors, A.C.: Information technology and the good life. In: Kaplan, B., Truex,
D.P., Wastell, D., Wood-Harper, A.T., DeGross, J.I. (eds.) Information Systems Research.
IIFIP, vol. 143, pp. 687–692. Springer, Boston, MA (2004). https://doi.org/10.1007/1-4020-
8095-6_45
13. Fitzgerald, M., Kruschwitz, N., Bonnet, D., Welch, M.: Embracing digital technology: a new
strategic imperative. MIT Sloan Management Review, Research Report (2013)
14. Martin, A.: Digital Literacy for the Third Age: Sustaining Identity in an Uncertain World.
eLearning Papers, no. 12, p. 1 (2009)
2 The references of the 19 articles selected for the Systematic Literature Review can be found
in the online appendix at ResearchGate: https://doi.org/10.13140/RG.2.2.33596.18560.
426 M. Hafsi and S. Assar
15. Harmsen, F., Proper, H.A.E., Kok, N.: Informed governance of enterprise transformations.
In: Proper, E., Harmsen, F., Dietz, Jan L.G. (eds.) PRET 2009. LNBIP, vol. 28, pp. 155–180.
Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01859-6_9
16. Boh, W.F., Yellin, D.: Using enterprise architecture standards in managing information
technology. J. Manag. Inf. Syst. 23(3), 163–207 (2007)
17. Radeke, F.: Toward understanding enterprise architecture management’s role in strategic
change: antecedents, processes, outcomes. In: Wirtschaftinformatik, Paper 62, Zuerich (2011)
18. Pulkkinen, M., Naumenko, A., Luostarinen, K.: Managing information security in a busi-
ness network of machinery maintenance services business - enterprise architecture as a
coordination tool. J. Syst. Softw. 80(10), 1607–1620 (2007)
19. Foorthuis, R., Van Steenbergen, M., Mushkudiani, N., Bruls, W., Brinkkemper, S., Bos, R.:
On course, but not there yet: enterprise architecture conformance and benefits in systems
development. In International Conference on IS (ICIS), Paper 110 (2010)
20. Lange, M., Mendling, J., Recker, J.: Realizing benefits from enterprise architecture: a
measurement model. In 20th European Conference on IS (ECIS), Paper 10 (2012)
21. van Gils, B., Proper, H.A.: Enterprise modelling in the age of digital transformation. In:
Buchmann, R.A., Karagiannis, D., Kirikova, M. (eds.) PoEM 2018. LNBIP, vol. 335, pp. 257–
273. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-02302-7_16
22. Fayoumi, A.: Toward an adaptive enterprise modelling platform. In: Buchmann, R.A., Kara-
giannis, D., Kirikova, M. (eds.) PoEM 2018. LNBIP, vol. 335, pp. 362–371. Springer, Cham
(2018). https://doi.org/10.1007/978-3-030-02302-7_23
23. Winter, R., Labusch, N.: Towards a conceptualization of architectural support for enterprise
transformation. In: ECIS 2013. AIS Library (2013)
24. Kitchenham, B.: Guidelines for performing systematic literature reviews in software.
Technical Report EBSE-2007-01, UK, Keele University and University of Durham (2007)
25. Okoli, C.: A guide to conducting a standalone systematic literature review. Commun. AIS 37
(2015). https://doi.org/10.17705/1CAIS.03743
26. Ngai, E.W.T., Wat, F.K.T.: A literature review and classification of electronic commerce
research. Inf. Manag. 39(5), 415–429 (2002)
27. Nguyen-Duc, A., Cruzes, D.S., Conradi, R.: The impact of global dispersion on coordination,
team performance and software quality: a systematic literature review. Inf. Softw. Technol.
57, 277–294 (2015)
28. Hauge, O., Ayala, C., Conradi, R.: Adoption of open source software in software intensive
organizations – a systematic literature review. Inf. Softw. Technol. 52(11), 1133–1154 (2010)
29. The Open Group: ArchiMate® 2.0 Specification. The Open Group, Berkshire, UK (2017)
30. Bernus, P., Noran, O.: A metamodel for enterprise architecture. In: Bernus, P., Doumeingts,
G., Fox, M. (eds.) EAI2N 2010. IAICT, vol. 326, pp. 56–65. Springer, Heidelberg (2010).
https://doi.org/10.1007/978-3-642-15509-3_6
31. Chen, Z., Pooley, R.: Requirement analysis for enterprise IS – developing an ontological meta-
model for Zackman framework. In: ICIS Proceedings, Paper 182. AIS Electronic Library
(2009)
32. DoD, Department of Defense: DoD Architecture Framework Version 2.02 (2012). https://
dodcio.defense.gov/Library/DoD-Architecture-Framework/dodaf20_dm2/. Accessed 05 Feb
2020
33. IEEE: IEEE Recommended Practice for Architectural Description of Software Intensive
Systems (IEEE Std 1471-2000), New York, NY (2000)
34. Rabiee, F.: Focus-group interview and data analysis. Proc. Nutr. Soc. 63, 655–660 (2004)
35. Rico, D.F.: Optimizing the ROI of enterprise architecture using real options. In: End User
Computing Challenges and Technologies: Emerging Tools and Applications, Information
Science Reference, Hershey, PA (2007)
Does Enterprise Architecture Support Customer Experience Improvement? 427
36. Forrester: Banking of the future: how banks will use digital capabilities to remain competitive
(2019). https://www.forrester.com/webinar/Banking+Of+The+Future+How+Banks+Will+
Use+Digital+Capabilities+To+Remain+Competitive/-/E-WEB19183. Accessed 05 Feb 2020
37. Gartner: Hype Cycle for Digital Banking Transformation (2019). https://www.gartner.com/en/
documents/3955840/hype-cycle-for-digital-banking-transformation-2019. Accessed 05 Feb
2020
38. Hafsi, M., Assar, S.: Does enterprise architecture support digital transformation endeav-
ors? Questionning the old concepts in light of new findings. In: Proceedings Mediterranean
Conference on Information Systems (MCIS) (2019)
39. Hafsi, M., Asaar, S.; What enterprise architecture can bring for digital transformation: an
exploratory study. In: IEEE 18th Conference on Business Informatics (2016)
40. Hafsi, M., Assar, S.: Managing strategy in digital transformation context: an exploratory
analysis of enterprise architecture management support. In: IEEE 21st Conference on Business
Informatics (2019)
A Formal Basis for Business Model Evaluation
with Linguistic Summaries
(Work-in-Progress Paper)
Rick Gilsing, Anna Wilbik, Paul Grefen, Oktay Turetken, and Baris Ozkan
1 Introduction
Factors such as digitization, globalization and rapid technology change cause evolution
of contemporary markets at an accelerated pace [1, 2]. Although these factors provide
organizations with promising opportunities with respect to digital innovation and customer
engagement, organizations are increasingly forced to adapt their current business logic
to enable the adoption of new IT developments and the adherence to shifting customer
needs. It is therefore not surprising that we see the increased prevalence of the business
model concept in IS research [1, 3]. A business model describes the logic of how value
is created and captured, the internal and external resources used to enable value creation
and the organizational and technical architecture deployed to support the business model
[4, 5]. Business models bridge the gap between business strategy [6] and operational
business process models [7] as they concretize strategy and provide the context for the
underlying process models. As such, given their pivotal role in business conceptualiza-
tion and their descriptive and explanatory power, they are often used as a unit of analysis
to understand the impact of IT or digital innovation and to structure its implementation
[1, 8].
The adaptation or innovation of business models to accommodate or integrate digital
innovation is a complex, non-linear design process and requires several iterative design
and evaluation tasks [9]. Normative guidance, technological rules and methodological
support can aid both researchers and practitioners in understanding or conducting business
model innovation [10]. Although tools and methods have been proposed in research to
support or guide business model design [11–13], limited support is present, particularly
from an engineering or methodological perspective, for the evaluation of business mod-
els [1, 14]. This issue is even more apparent for the early phases of business model
innovation, for which business model design decisions are often high-level in nature and
uncertain [15, 16], resulting in difficulties with respect to quantifying or even merely
assessing the potential risks and outcomes as a part of business model evaluation. As a
result, qualitative evaluation approaches are advocated to support early-phase business
model innovation [17]. Although qualitative techniques such as focus groups or expert
judgment are frequently used [18], these techniques are informal and lack structure to
be systematically applied. On the other hand, we see the use of performance criteria
or metrics as a more formalized approach to qualitative business model evaluation [19,
20]. However, these techniques lack methodological guidance on how they should be
catered to the specific characteristics of business model designs, and they often require
quantitative support to be effectively used.
As a novel technique, we have proposed the use of linguistic summarization as a
means to derive and specify ‘soft’ key performance indicators (SKPIs) that describe
performance characteristics of specific business model designs [21]. These SKPIs are
expressed in soft-quantitative terms, which makes them suitable to support early business
model evaluation, when ‘hard’ quantitative data on a business model is not yet available.
So far, the technique has been proposed in an informal way. To support systematic
application and the development of tooling towards business model evaluation, we make
the next step in this paper: we focus on the formalization of the approach, linking the
formal specification of business models with the formal specification of the type of
linguistic summaries that we use (intentional linguistic summaries). On this basis, we show how
the formal model is a basis for the development of support for our approach. Accordingly,
the research question for this paper is as follows:
The answer to this question helps bridge the gap that currently exists between the fully
qualitative evaluation of business models (which relies heavily on the intuition of designers)
and the fully quantitative evaluation of business models (which requires far more data
than is typically available in the early stages of business model design). Bridging this
gap is of interest to both the business model research community and the design and use
of business models in business practice.
The remainder of this research-in-progress paper is structured as follows. In Sect. 2,
we discuss the research background on business models, business model evaluation and
linguistic summarization. Section 3 introduces the running example that we use for the
remainder of this paper to illustrate the application of linguistic summarization. Section 4
details the formalization of our approach. We illustrate how the formalization supports the
practical application of our method in Sect. 5 through ILSs with respect to the running
case. Section 6 concludes the paper, outlining the avenues for future work and the
outlook of our research.
2 Related Work
In this section, we describe related work in three fields of research that form the basis for
our work: business model design, business model evaluation and linguistic summaries.
Business Model Design. Business models are increasingly used in IS research as a
means to explore how digital innovations or IT-enabled innovations may impact the
current business logic [1, 11]. Given its pivotal role between business strategy and
operational models [7], the concept of business model often serves as a bridge to sup-
port business-IT alignment. Many componentizations have been proposed to structure
the business model construct [22]. For instance, from an IS perspective, Hedman and
Kalling [3] componentize business models into levels related to the market or environ-
ment, the offerings of the business model, the architectural structure, and the resources
deployed. Through detailing each level, organizations obtain a better understanding of
what business logic is followed, how resources can be integrated or deployed and how
this may influence or support customer offerings.
Several tools have been proposed to guide the design of business models. For instance,
Osterwalder and Pigneur [11] propose the widely popular Business Model Canvas
(BMC), which represents a graphical template consisting of nine building blocks that
address various elements of business model design. The BMC takes an organization-
centric, resource-based perspective and focuses explicitly on customer-supplier inter-
actions and relationships. However, as organizations increasingly transition towards
service orientation and collaborative networks [23–25], tooling for networked,
service-dominant business model design has been proposed. For instance, Zol-
nowski et al. [26] propose the Service Business Model Canvas, which adapts the origi-
nal BMC to accommodate the modelling of service business. Similarly, Grefen [27] and
Turetken et al. [12] describe and evaluate the Service-Dominant Business Model Radar
(SDBM/R), which through its circular template accommodates an explicitly networked
perspective of business model design.
Linguistic Summaries. Linguistic summaries (LS) are statements with a specific for-
mat (template or protoform) that are used to describe data in brief natural language
constructs and that can be automatically generated [33]. LS make it easier to comprehend
a set of data [34]. Linguistic data summaries are quantified propositions with
two protoforms (or templates): a simple protoform, Q y’s are P, exemplified by “most
cars are new” and an extended protoform, Q Ry’s are P, exemplified by “most fast cars are
new”. Q is the linguistic quantifier, e.g. most. P is the summarizer, an attribute together
with a linguistic value, e.g., new car. R is an optional qualifier, another attribute together
with a linguistic value, which narrows down the scope of the universe, e.g., fast car. Inten-
tional linguistic summaries (ILSs) [21] are quantified statements with the same structure
as linguistic summaries: Q y’s are P and Q Ry’s are P. The main difference is that ILSs
are not created from existing data, but capture intentions that the stakeholders want to
be true. In other words, they specify desired constraints over future data. We use this
construct to specify constraints over future effects of business models.
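To make the protoform semantics concrete, the following sketch evaluates the truth of both protoforms with the classic fuzzy calculus that is common in the linguistic-summarization literature [33, 34]: the truth of "Q y's are P" is the quantifier applied to the mean membership in P, and the truth of "Q Ry's are P" is the quantifier applied to the proportion of P within R. The piecewise-linear definition of most and all function names below are illustrative assumptions, not artifacts of the cited papers.

# A minimal sketch of linguistic-summary truth evaluation; 'most' and the
# function names are illustrative assumptions, not taken from the paper.

def mu_most(p: float) -> float:
    """Fuzzy quantifier 'most' over a proportion p in [0, 1]."""
    if p <= 0.3:
        return 0.0
    if p >= 0.8:
        return 1.0
    return (p - 0.3) / 0.5

def truth_simple(mu_p: list[float], quantifier=mu_most) -> float:
    """Truth of "Q y's are P": Q applied to the mean membership in P."""
    return quantifier(sum(mu_p) / len(mu_p))

def truth_extended(mu_r: list[float], mu_p: list[float], quantifier=mu_most) -> float:
    """Truth of "Q Ry's are P": Q applied to the proportion of P within R."""
    overlap = sum(min(r, p) for r, p in zip(mu_r, mu_p))
    scope = sum(mu_r)
    return quantifier(overlap / scope) if scope > 0 else 0.0

# "Most cars are new" and "most fast cars are new" over six cars:
new = [1.0, 0.8, 0.9, 0.2, 1.0, 0.7]    # membership in fuzzy set 'new'
fast = [1.0, 0.1, 0.9, 0.8, 0.0, 1.0]   # membership in qualifier 'fast'
print(truth_simple(new))                # ~0.93
print(truth_extended(fast, new))        # ~0.93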
3 Running Example
The running example concerns a large city that suffers from traffic jams induced by
large-scale events and seeks a collaborative mobility solution [35]. To realize this
solution, the resources of partners such as platform providers, municipality, road author-
ity, parking provider and event location and event providers were integrated. To further
stimulate the financial viability of the collaboration, retailers were involved as they may
significantly benefit from event visitors arriving early in the city. The SDBM/R tech-
nique was used as the tool for business model design [12, 27, 36]. The resulting business
model design to accommodate the solution is presented in Fig. 1. In this business model
radar (which we label TJFERC), we see the central value (value-in-use) of the business
model in the center of the radar and the involved customer (Large City) as one of the
eight involved business parties (the actors in the network) – each having one ‘slice’
of the radar, labeled in the outer ring. Apart from the customer, the orchestrator party
(Mobility Broker) and six other parties are present. A party can be a core party (i.e.,
essential for the functioning of the business model and operation of the offered service)
or an enriching party (i.e., bringing non-essential added value). The three rings around
the central value detail for each party (from the center outwards) the value that each
party contributes to the central value-in-use (its actor value proposition), the activities
it has to perform to create this value (value coproduction activities), and its costs and
benefits (both financial and non-financial). Note that each business model only has a
single value-in-use, which is construed from the set of actor value propositions. As a
consequence, to generate a different value-in-use, a different set of value propositions
would be needed (which in turn results in a different business model design).
Fig. 1. Business model design draft to address event-induced traffic challenges in the inner city.
To support the evaluation of the business model design, we generate ILSs per party of
the business model. ILSs represent operationalized, strategic preferences or summaries
per party that are specifically catered to the business model design. As such, each business
model design, depending on its contents, may result in different ILSs. The ILSs serve
as the basis for communicating under what conditions a party is willing to participate in
the business model. By assessing whether the ILSs can be achieved, the viability of the
business model can be evaluated [21]. The ILSs are presented in a pre-specified structure
(named protoforms), as usual in research into linguistic summarization [34]. Although
the ILSs are initially soft-quantitative in nature, the structure of the summaries allows
them to be further quantified in later stages through concrete membership functions
of the linguistic summaries [33, 34]. We will demonstrate the ILSs for this example in
Sect. 5.
4 Formalization
To formalize the SDBM/R concept (which we call business model radar or BMR from
now on for easy readability), we identify that this concept has an overall structure that is
independent from the number of involved parties, and a structure per party. Hence, we
provide the formalization in two steps: the radar and the parties.
A business model radar (BMR) is a business model specification with the following
formal type and constraint:
BMR = ⟨name: L, value: ViU, cust: P, orch: P, parts: {⟨part: P, core: BOOL⟩}⟩
parts ≠ ∅
Here, name is the name of the business model from the set of labels L, value is the value
in use of the business model from the set of values-in-use ViU, cust is the customer from
the set of parties P, orch is the orchestrator party from P, and parts is the set of other
parties of type {⟨P, BOOL⟩}, i.e., a set of pairs of parties and an indication whether a
party is a core party in the business model. The structure states that exactly one customer
party is present and exactly one orchestrator party. The additional constraint specifies
that at least one other party must be present – this to make it a true networked business
model and not a dyadic relation.
A BMR instance b therefore has the following format:

b = ⟨name_b, value_b, cust_b, orch_b, {⟨part_1, core_1⟩, …, ⟨part_n, core_n⟩}⟩
A party is the specification of a role in a business model radar with the following type:

P = ⟨name: L, avalp: {AVP}, acopa: {ACA}, aben: {AB}, acost: {AC}⟩
avalp, acopa, aben, acost ≠ ∅
The set avalp contains the set of actor value propositions of a party (a party can have
more than one actor value proposition), acopa the set of actor coproduction activities
(a party can have more than one activity), aben the set of actor benefits, and acost the
set of actor costs. All of the four sets need to be non-empty for a business model to be
viable: each actor needs to contribute to the central value-in-use, each actor needs to
perform at least one activity to generate this contribution, and each actor needs to have
both benefits (its reason to participate in the business model) and costs (not to be a ‘free
rider’ to the other parties).
The above shows that this simple formalization already provides a nice set of cor-
rectness criteria for business models specified in the SDBM/R technique, which can
be automatically checked. These criteria are of a syntactic nature, though, and specify
nothing about the intended business effects of the business model. To enable this, we
use intentional linguistic summaries.
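As an illustration of how such syntactic criteria could be checked automatically, the following sketch encodes the BMR and party types as plain data structures with validity checks. The field names mirror the formalization above; the classes and the checking logic are our own sketch, not tooling from this paper.

# Illustrative encoding of the BMR formalization as checkable data structures.
from dataclasses import dataclass, field

@dataclass
class Party:
    name: str
    avalp: set[str] = field(default_factory=set)  # actor value propositions
    acopa: set[str] = field(default_factory=set)  # actor coproduction activities
    aben: set[str] = field(default_factory=set)   # actor benefits
    acost: set[str] = field(default_factory=set)  # actor costs

    def is_well_formed(self) -> bool:
        # Each party must contribute, act, benefit, and bear costs.
        return all((self.avalp, self.acopa, self.aben, self.acost))

@dataclass
class BMR:
    name: str
    value: str                       # the single value-in-use
    cust: Party                      # exactly one customer
    orch: Party                      # exactly one orchestrator
    parts: list[tuple[Party, bool]]  # other parties with core/enriching flag
                                     # (the formal spec uses a set of pairs)

    def is_well_formed(self) -> bool:
        # parts must be non-empty: at least one other party makes the model
        # a true networked business model rather than a dyadic relation.
        parties = [self.cust, self.orch] + [p for p, _ in self.parts]
        return bool(self.parts) and all(p.is_well_formed() for p in parties)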
An intentional soft quantified statement (ISQS) has the following type:

ISQS = ⟨quant: QF, obj: OB, oqual: {OQ}, ochar: {OC}⟩

Here, quant is a soft quantifier of type QF, obj is the set of quantified objects of type
OB, oqual is the set of object qualifications (features) of type OQ, and ochar is the set of
object characteristics (features) of type OC. Object qualification oqual can be a feature
describing all objects in a UoD.
An ISQS instance qs therefore has the following format:

qs = ⟨qf, ob, oq, oc⟩
In the above specification, QF is the enumerated set of soft quantifiers, which state the
intended fraction of the set of quantified objects. Usually relational quantifiers are used
(i.e., describing the proportion within the set), like most, indicating above 50%. Absolute
quantifiers (i.e., referring to the absolute object count), e.g., around 5 or more than 7,
are seldom used. An often-used set of soft quantifiers is the following, and we will use it in
our work for soft quantification of business models:
QF_ou = {ALL, ALMOST ALL, MANY, SOME, FEW, ALMOST NONE, NONE}
We use only a part of the expressiveness of the linguistic summaries model to stay
pragmatic. Therefore, we define the elements of QF_ou to have a fuzzy ordinal relation,
denoted with the fuzzy comparison operator ⪰:

ALL ⪰ ALMOST ALL ⪰ MANY ⪰ SOME ⪰ FEW ⪰ ALMOST NONE ⪰ NONE
The elements of QF indicate the desired proportion of a set, modelled using a fuzzy
set. An actual proportion of a subset may therefore satisfy two adjacent soft quantifiers,
where adjacent is defined by the fuzzy ordinal relation specified above.
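A minimal sketch of how the soft quantifiers could be made precise as fuzzy sets over proportions; the trapezoidal breakpoints are illustrative assumptions (the paper deliberately leaves such definitions to later design stages), and the example shows a proportion that satisfies two adjacent quantifiers.

# Sketch: relational soft quantifiers as fuzzy sets over proportions in [0, 1].
# All breakpoints below are illustrative assumptions.

def trapezoid(a: float, b: float, c: float, d: float):
    """Membership function rising on [a,b], flat on [b,c], falling on [c,d]."""
    def mu(x: float) -> float:
        if x <= a or x >= d:
            return 0.0
        if b <= x <= c:
            return 1.0
        return (x - a) / (b - a) if x < b else (d - x) / (d - c)
    return mu

QF = {  # ordered per the fuzzy ordinal relation ALL ⪰ ... ⪰ NONE
    "NONE":        trapezoid(-1.0, 0.0, 0.0, 0.05),
    "ALMOST NONE": trapezoid(0.0, 0.05, 0.1, 0.2),
    "FEW":         trapezoid(0.1, 0.2, 0.3, 0.4),
    "SOME":        trapezoid(0.3, 0.4, 0.5, 0.6),
    "MANY":        trapezoid(0.5, 0.6, 0.8, 0.9),
    "ALMOST ALL":  trapezoid(0.8, 0.9, 0.95, 1.0),
    "ALL":         trapezoid(0.95, 1.0, 1.0, 2.0),
}

# An actual proportion may satisfy two adjacent quantifiers to some degree:
p = 0.55
print({q: round(mu(p), 2) for q, mu in QF.items() if mu(p) > 0})
# -> {'SOME': 0.5, 'MANY': 0.5}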
The set of quantified objects OB is the powerset of objects in the UoD over which
we want to state soft quantifications:
OB = {{o ∈ UoD}}
ob = {o ∈ UoD}
A feature of an object is a tuple of type F that contains the feature label and the set of
linguistic value labels:

F = ⟨fl: FL, lvs: {LV}⟩

For instance, the traffic-jam feature of the running example can be written as
⟨traffic-jam, {heavy, medium, small}⟩.
Linguistic value labels can be made precise and represented as fuzzy sets, with M as the
membership function:
M: OB × FL × LV → [0, 1]
The membership functions do not have to be defined for intentional soft quantified
statements at the early design stage, allowing the linguistic value labels to have a more
intuitive definition and meaning, and to be made precise in later design stages.
The set of features of an object is given by the function ofeat that takes an object:
ofeat: UoD → {F}
The set of object qualifications OQ consists of pairs of a feature label and a linguistic
value. More complex situations are allowed, where multiple feature labels and linguistic
values can be combined with conjunctions. For pragmatic reasons, we focus only on the
simple case in this work.
OQ: FL × LV
We have a function oqmem which, for a set of objects in the UoD and a feature label
combined with a linguistic value, identifies the subset of the UoD whose elements carry
that feature label with that feature value:

oqmem: OB × FL × LV → OB
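The following sketch illustrates oqmem as an alpha-cut over the membership function M, using a deliberately crude, crisp toy definition of M for the traffic-jam classes of the running example; the alpha-cut, the data, and the concrete membership values are our assumptions.

# Sketch of oqmem: select the subset of objects whose feature carries a given
# linguistic value, via membership function M and an alpha-cut.

Object = dict  # an object as a mapping from feature labels to raw values

def M(obj: Object, fl: str, lv: str) -> float:
    """Toy (crisp) membership of obj's feature fl in linguistic value lv."""
    severity = obj.get(fl, 0.0)
    cuts = {"small": (0.0, 0.3), "medium": (0.3, 0.7), "heavy": (0.7, 1.0)}
    lo, hi = cuts[lv]
    return 1.0 if lo <= severity < hi else 0.0

def oqmem(ob: list[Object], fl: str, lv: str, alpha: float = 0.5) -> list[Object]:
    """oqmem: OB x FL x LV -> OB, realized as an alpha-cut over M."""
    return [o for o in ob if M(o, fl, lv) >= alpha]

jams = [{"severity": 0.9}, {"severity": 0.2}, {"severity": 0.8}]
print(len(oqmem(jams, "severity", "heavy")))  # -> 2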
5 Application to the Running Example
5.1 Customer
From a customer-oriented perspective, we create a set of ISQS templates that describe the
most important aspects of a business model for evaluation from the customer perspective,
i.e., the value-in-use, the benefits and the costs. Note that based on this template the
respective stakeholder (in this case the customer) can select the objects that are most
appropriate to express its strategic goals or motivation to participate.
Value-in-Use. We create a soft quantification over the value-in-use for the set of cus-
tomers of a business model, stating that the majority of customers indeed receives this
value-in-use:

⟨MOST, cust, ⟨anyF, allV⟩, ⟨f(viu), lv⟩⟩

Note that the value ⟨anyF, allV⟩ for the object qualification function means that all
objects are included. f(viu) is a linguistic label for a feature of the value-in-use.
For the running example of Sect. 3, the value-in-use is traffic-jam free event rich
city. A feature of this value-in-use is the amount of traffic jams and their classification.
Traffic jams can be characterized by, e.g., three linguistic labels corresponding to three
classes: heavy, medium, and small. In this case, the ISQS can be as follows:
qs0: Most large cities have few heavy traffic-jams caused by the events.
where most is the quantifier (qf), large city is the customer (p1), few traffic-jams
caused by the events is the feature label for the value-in-use, and heavy is its linguistic
label.
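Since ISQSs follow a fixed protoform, their textual form can be generated mechanically from the tuple representation. The following sketch renders qs0 from a tuple whose layout mirrors ⟨quant, obj, oqual, ochar⟩, with the all-inclusive qualification ⟨anyF, allV⟩ represented as None; the renderer and its naive string handling are our illustration only.

# Sketch: rendering an ISQS tuple into its textual protoform.

def render_isqs(quant: str, obj: str, oqual: str | None,
                ochar: tuple[str, str]) -> str:
    feature_label, linguistic_value = ochar
    scope = f"{oqual} {obj}" if oqual else obj  # optional qualifier narrows scope
    return f"{quant} {scope} have {linguistic_value} {feature_label}."

qs0 = ("Most", "large cities", None,
       ("heavy traffic-jams caused by the events", "few"))
print(render_isqs(*qs0))
# -> Most large cities have few heavy traffic-jams caused by the events.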
Benefits. We create a soft quantification over the benefits for the customer, stating that
desired benefits occur often:

⟨MOST, cust, ⟨anyF, allV⟩, ⟨benefit, lv⟩⟩
For the running example we use the above template to create the following ISQSs
describing the benefits of the customer (large city):
qs1a = ⟨MOST, large city, ⟨anyF, allV⟩, ⟨less traffic-jam, heavy⟩⟩
qs1b = ⟨MOST, large city, ⟨anyF, allV⟩, ⟨more events, big⟩⟩
Costs. We make a soft quantification over the costs for the customer, stating that
unacceptable costs do not occur often:

⟨FEW, cust, ⟨anyF, allV⟩, ⟨cost, unacceptable⟩⟩
5.2 Core Parties
The core parties are essential for the functioning of a business model. Consequently,
we make soft quantifications over the costs/benefits for each core party, stating that an
acceptable cost/benefit ratio occurs often:
For the running example we have created a set of example statements. For the parking
provider an ISQS is, in textual format:
qsk1: Most parking providers have significantly improved planning on most events.
The retailer is mostly focused on the financial aspect, therefore a good ISQS is:

qsk2: All retailers make an acceptable profit on most events.
For the visitor, the concert experience and the memories are the most important, leading
to an ISQS that quantifies the quality of the event experience and the memories it creates.
For the event organizers and the event location providers the focus is also on customer
satisfaction:
qsk4: All event organizers (location providers) have a high customer satisfaction on most
events.
Again, each stakeholder can change the set of objects of the introduced templates to
generate ISQSs that express its strategic motives or goals. Please note that in the sum-
maries presented above, the focus is on the stakeholder, e.g., the summaries describe the
retailers, visitors and event organizers. A different set of summaries can be obtained if
we put the operation, in this case an event, in the focus of the linguistic summaries. Currently
we are working on normative guidance towards what level of the operation should be
used as focus of the linguistic summaries, given the preferences of stakeholders and the
context of the BMR.
Given all the above ingredients for the formal representation of the example from Sect. 3,
we can specify the soft-quantified business model. Such a business model is considered valid
from a soft-quantified perspective if all ISQSs for that BMR are above the fuzzy 'truth
value' T ∈ M, where T can be chosen depending on the 'strictness' of business model
evaluation.
If, for example, all the statements we have generated are evaluated as at least "rather
feasible", the model is judged to be valid for all stakeholders. For instance, if by means
of collaborative discussions all stakeholders determine that the linguistic summary "all
retailers make an acceptable profit on most events" is rather feasible, the business
model design for the retailer is considered feasible. If all generated linguistic summaries
are feasible, the business model design can progress to the next phase (integration), in
which the design is further concretized and quantified in a more traditional way.
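A sketch of this validity check over an ordered feasibility scale; the scale, the ratings, and the function names are illustrative assumptions, with "rather feasible" playing the role of the truth threshold T.

# Sketch of the soft-quantified validity check: a BMR passes when every ISQS
# is judged at least as feasible as a chosen threshold T.

FEASIBILITY = ["infeasible", "hardly feasible", "rather feasible", "feasible"]

def is_valid(ratings: dict[str, str], threshold: str = "rather feasible") -> bool:
    """All ISQS ratings must reach the threshold on the ordered scale."""
    t = FEASIBILITY.index(threshold)
    return all(FEASIBILITY.index(r) >= t for r in ratings.values())

ratings = {
    "qs0: few heavy traffic-jams":      "feasible",
    "qsk1: improved parking planning":  "rather feasible",
    "qsk2: acceptable retailer profit": "rather feasible",
    "qsk4: high customer satisfaction": "feasible",
}
print(is_valid(ratings))  # -> True: the design can progress to integration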
6 Conclusion and Future Work
As future work, we aim to provide normative guidance on how the strategic preferences
of stakeholders can be expressed by means of ISQSs, as well as rules with respect to the generation of
ISQSs. Currently, an almost infinite set of ISQSs can be generated per party in a business
model, which may inhibit the usability and interpretability of the outcomes. Therefore,
we will assess which ISQS templates should be generated under which circumstances,
or how the strategic preferences of a stakeholder can be captured through a limited set
of ISQSs. Moreover, we also aim to validate our method further to understand the initial
usability, usefulness and ease-of-use of the proposed method.
Future research will build upon this formalization. One promising approach is to use
the formalization as a basis for developing automated tooling for business model evalu-
ation. The formalisms presented in this paper can rather straightforwardly be translated
into a data model and a rule base for such a tool. A second approach is the development
of a directive on how different evaluation methods can be formalized to accommodate
business model evaluation.
References
1. Veit, D., et al.: Business models. Bus. Inf. Syst. Eng. 6(1), 45–53 (2014)
2. Gambardella, A., Mcgahan, A.: Business-model innovation: general purpose technologies
and their implications for industry structure. Long Range Plann. 43, 262–271 (2010)
3. Hedman, J., Kalling, T.: The business model concept: theoretical underpinnings and empirical
illustrations. Eur. J. Inf. Syst. 12(1), 49–59 (2003)
4. Al-Debei, M., Avison, D.: Developing a unified framework of the business model concept.
Eur. J. Inf. Syst. 19(3), 359–376 (2010)
5. Zott, C., Amit, R.: Business model design: an activity system perspective. Long Range Plann.
43(2–3), 216–226 (2010)
6. Shafer, S., Smith, J., Linder, J.: The power of business models. Bus. Horiz. 48, 199–207
(2005)
7. Al-Debei, M., El-Haddadeh, R., Avison, D.: Defining the business model in the new world
of digital business. In: AMCIS 2008 Proceedings (2008)
8. Teece, D.: Business models, business strategy and innovation. Long Range Plann. 43, 172–194
(2010)
9. Sosna, M., Trevinyo-Rodriguez, R., Velamuri, S.: Business model innovation through trial-
and-error learning: the Naturhouse case. Long Range Plann. 43(2–3), 383–407 (2010)
10. Bucherer, E., Eisert, U., Gassmann, O.: Towards systematic business model innovation:
lessons from product innovation management. Creat. Innov. Manag. 21(2), 183–198 (2012)
11. Osterwalder, A., Pigneur, Y.: Business Model Generation: A Handbook for Visionaries, Game
Changers, and Challengers. Wiley, Hoboken (2010)
12. Turetken, O., Grefen, P., Gilsing, R., Adali, O.: Service-dominant business model design for
digital innovation in smart mobility. Bus. Inf. Syst. Eng. 61(1), 9–29 (2019)
13. Gordijn, J., Akkermans, H.: Designing and evaluating E-business models. IEEE Intell. Syst.
Appl. 16(4), 11–17 (2001)
14. Simmert, B., Ebel, P., Peters, C., Bittner, E., Leimeister, J.: Conquering the challenge of
continuous business model improvement. Bus. Inf. Syst. Eng. 61(4), 451–468 (2019)
15. McGrath, R.: Business models: a discovery driven approach. Long Range Plann. 43(2–3),
247–261 (2010)
16. Zott, C., Amit, R.: Business model innovation: toward a process perspective, pp. 395–406
(2015)
17. Tesch, J., Brillinger, A.: The evaluation aspect of digital business model innovation: a literature
review on tools and methodologies. In: 25th European Conference on Information Systems,
pp. 2250–2268 (2017)
18. Bocken, N.M.P., Antikainen, M.: Circular business model experimentation: concept and
approaches. In: Dao, D., Howlett, R.J., Setchi, R., Vlacic, L. (eds.) KES-SDM 2018. SIST, vol.
130, pp. 239–250. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-04290-5_25
19. Heikkilä, M., Bouwman, H., Heikkilä, J., Solaimani, S., Janssen, W.: Business model metrics:
an open repository. Inf. Syst. E-bus. Manag. 14(2), 337–366 (2016)
20. Diaz-Diaz, R., Muñoz, L., Pérez-González, D.: The business model evaluation tool for smart
cities: application to SmartSantander use cases. Energies 10(3), 262 (2017)
21. Wilbik, A., Gilsing, R., Grefen, P., Turetken, O., Ozkan, B.: Intentional linguistic summaries
for collaborative business model radars. In: WCCI (2020)
22. DaSilva, C., Trkman, P.: Business model: what it is and what it is not. Long Range Plann.
47(6), 379–389 (2014)
23. Kindström, D.: Towards a service-based business model – key aspects for future competitive
advantage. Eur. Manag. J. 28(6), 479–490 (2010)
24. Gebauer, H., Fleisch, E., Friedli, T.: Overcoming the service paradox in manufacturing
companies. Eur. Manag. J. 23(1), 14–26 (2005)
25. Grefen, P., Turetken, O.: Achieving business process agility through service engineering in
extended business networks. In: BPTrends (2018)
26. Zolnowski, A., Weiß, C., Böhmann, T.: Representing service business models with the service
business model canvas - the case of a mobile payment service in the retail industry. In: 47th
Hawaii International Conference on System Science, pp. 718–727 (2014)
27. Grefen, P.: Service-Dominant Business Engineering with BASE/X: Business Modeling
Handbook. Amazon CreateSpace (2015)
28. Schrauder, S., Kock, A., Baccarella, C., Voigt, K.: Takin’ care of business models: the impact
of business model evaluation on front-end success. J. Prod. Innov. Manag. 35(3), 410–426
(2018)
29. Mateu, J., March-Chorda, I.: Searching for better business models assessment methods.
Manag. Decis. 54(10), 2433–2446 (2016)
30. Moellers, T., Von der Burg, L., Bansemir, B., Pretzl, M., Gassmann, O.: System dynamics for
corporate business model innovation. Electron. Mark. 29, 387–406 (2019). https://doi.org/10.
1007/s12525-019-00329-y
31. Daas, D., Hurkmans, T., Overbeek, S., Bouwman, H.: Developing a decision support sys-
tem for business model design. Electron. Mark. 23, 251–265 (2013). https://doi.org/10.1007/
s12525-012-0115-1
32. Schoormann, T., Kaufhold, A., Behrens, D., Knackstedt, R.: Towards a typology of approaches
for sustainability-oriented business model evaluation. In: Abramowicz, W., Paschke, A. (eds.)
BIS 2018. LNBIP, vol. 320, pp. 58–70. Springer, Cham (2018). https://doi.org/10.1007/978-
3-319-93931-5_5
33. Yager, R.R.: A new approach to the summarization of data. Inf. Sci. 28(1), 69–86 (1982)
34. Kacprzyk, J., Zadrożny, S.: Linguistic database summaries and their protoforms: towards
natural language based knowledge discovery tools. Inf. Sci. 173(4), 281–304 (2005)
35. Grefen, P., Turetken, O., Traganos, K., den Hollander, A., Eshuis, R.: Creating agility in traffic
management by collaborative service-dominant business engineering. In: Camarinha-Matos,
L.M., Bénaben, F., Picard, W. (eds.) PRO-VE 2015. IAICT, vol. 463, pp. 100–109. Springer,
Cham (2015). https://doi.org/10.1007/978-3-319-24141-8_9
36. Gilsing, R., Turetken, O., Adali, O.E., Grefen, P.: A reference model for the design of service-
dominant business models in the smart mobility domain. In: International Conference on
Information Systems (2018)
Author Index