0231 - PM Internship Report
INTERNSHIP
BACHELOR OF TECHNOLOGY
in
A. Hema Harshitha
214G1A0231
2024-2025
Department of Electrical and Electronics Engineering
Certificate
This is to certify that the internship report entitled Process Mining Virtual Internship is the
bonafide work carried out by A.HEMA HARSHITHA bearing Roll Number
214G1A0231 in partial fulfilment of the requirements for the award of the degree of Bachelor
of Technology in Electrical and Electronics Engineering for three months from April 2024
to June 2024.
The All India Council for Technical Education (AICTE) is the statutory body and a national
level council for technical education, under the Department of Higher Education, Ministry of
Education, Government of India. AICTE is responsible for proper planning and coordinated
development of the technical education and management education system in India.
EduSkills Foundation is a non-profit organization that enables an Industry 4.0-ready digital
workforce in India. EduSkills works closely with students, faculty, educational institutions,
and central/state governments to provide world-class curriculum access, industry exposure,
skill development, and career opportunities.
Purpose: The purpose of AICTE and EduSkills is to improve the quality and relevance of
technical education in India and to create a skilled and employable workforce for the emerging
industries. AICTE and EduSkills have launched several initiatives to provide virtual internship
opportunities for students in technical institutions across the country. The aim of these
initiatives is to bridge the gap between academia and industry by ensuring practical learning
and hands-on experience for the students.
The satisfaction and euphoria that accompany the successful completion of any task
would be incomplete without the mention of the people who made it possible, whose constant
guidance and encouragement crowned my efforts with success. It is a pleasure to have
the opportunity to express my gratitude to all of them.
I also express my sincere thanks to the Management for providing excellent facilities
and support.
Finally, I wish to convey my gratitude to my family, who provided all the support
and facilities that I needed.
A. Hema Harshitha
214G1A0231
Contents
Chapter 1: Introduction
1.1 Introduction to Process Mining
1.2 History of Process Mining
Chapter 2: Foundations of Process Mining
2.1 Features of Process Mining
2.2 Tools for Process Mining
Chapter 3: Celonis Process Mining: Unveiling Operational Insights
3.1 Introduction
3.2 Celonis Academy
3.2.1 Discovery
3.2.2 Conformance
3.2.3 Enhancement
3.3 Registration for Celonis Academy
3.4 Level 1: Introduction to Process Mining
3.5 Level 2: Process Mining Fundamentals
3.6 Level 2: Celonis Rising Star – Technical
3.6.1 Write PQL Queries
3.6.2 Get data into EMS
Chapter 4: Celonis Execution Management System (EMS)
4.1 Studio
4.2 Process Analytics
4.3 Process Explorer
4.3.1 Basic Configuration
4.4 Variant Explorer
4.5 Case Explorer
4.6 The Celonis Process Query Language
4.6.1 Language Overview
Chapter 5: Use Cases of Process Mining
Chapter 6: Applications of Process Mining
Chapter 7: Learning Outcomes
Conclusion
Internship Certificate
References
List of Figures
AI Artificial Intelligence
IT Information Technology
P2P Purchase-to-Pay
Chapter 1
Introduction
In the digital tapestry of organizations, every action, decision, and interaction leaves a
trace: a digital footprint that is frequently hidden by complexity. Welcome to the world
of process mining, where the ordinary transforms into a symphony of revelations,
showing the rhythmic pattern of actions that creates our modern-day reality. Processes
silently coordinate the flow of work, resources, and information in the busy corridors
of businesses and institutions. These procedures control effectiveness, quality, and
success in a variety of industries, including
• e-commerce order fulfilment.
• Patient care, allowing hospitals to streamline workflows for better medical
service.
• Supply chain processes, enhancing coordination and reducing delays.
Process Mining is often described as a technique that analyzes and tracks business
processes based on the data available in the information systems of an organization.
Process mining can help to discover, monitor, improve, and optimize the performance
and efficiency of the processes. Process Mining is the combination of two disciplines:
Data Science and Business Process Management. Process Mining essentially uses Data
Science techniques, such as Big Data and AI, to address Process Science problems such
as process improvement and automation.
Process mining was first introduced as a research discipline by Wil van der Aalst and
his colleagues in the early 2000s, when researchers began developing techniques to
analyze event data. The initial focus was on extracting process models from event logs
and on discovering patterns and bottlenecks in process data.
Here are some of the key milestones in the history of process mining:
• 1999: The term “Process mining” was first coined in a research proposal written by
the Dutch computer scientist Wil van der Aalst.
• 2002: The first paper on process mining was published by Wil van der Aalst and his
colleagues.
• 2004: The first commercial process mining tool was released by Fluxicon.
• 2009: The IEEE Task Force on Process Mining was established.
• 2012: The first Process Mining Manifesto was published.
Chapter 2
Many organizations and people think that process mining is a tool for modelling processes.
In fact, process mining is not a tool but an analytical method. It does not model
something into a process; rather, it is used to understand the real process. There are
several tools available for process mining, each offering various features and
capabilities to support different aspects of the process mining lifecycle. Here are some
notable process mining tools: ProM, Disco, Celonis, Prometheus, RapidMiner, Minit,
QPR ProcessAnalyzer, PAFnow, ProcessGold, UiPath Process Mining.
Chapter 3
Celonis Process Mining: Unveiling Operational
Insights
3.1 Introduction:
For this virtual internship program, we used the Celonis software as the platform to
understand and implement process mining. Celonis is a powerful and capable process
mining suite that collects and analyses IT data in order to generate actionable insights.
It is used to identify and fix operational flaws, making the overall operation more
effective. Celonis uses visual reporting to help find problems in existing
processes. It creates a flowchart of the company's processes by tracing any IT-supported
activities. After that, it creates models of the best solutions and of the
various variants that are currently in use. You can view the company's complete
operations in real time, including all active processes.
Celonis Academy offers training courses on process mining. The courses are designed
to help learners become Celonis experts by providing best-in-class instructor-led
training and goal-based training tracks coupled with hands-on courses. The vast library
of 2,000 hours of online training content is accessible around the clock and free of
charge. The courses cover the fundamentals of process mining in the first week and
advanced concepts of process mining in the second week. The course provides an
understanding and practice of the three types of process mining: Discovery,
Conformance, and Enhancement.
3.2.1 Discovery:
Process discovery is the initial phase of process mining. Transforming the event log into
a process model is the primary objective of process discovery. Any data storage system
that keeps track of organisational operations and their associated timestamps can
produce an event log. Such an event log must include a case id, an activity description,
and the timestamp at which the action was taken. Typically, a process model that is
representational of the event log is the end result of process discovery. Such a process
model may be found, for instance, by employing methods like the alpha algorithm,
heuristic mining, or inductive mining.
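As a minimal sketch of process discovery, the Python snippet below builds a directly-follows graph from a small event log. The log contents and the function name `discover_dfg` are illustrative assumptions. This is a simpler technique than the alpha algorithm, heuristic mining, or inductive mining named above, all of which start from the same directly-follows relation.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, timestamp) per event.
EVENT_LOG = [
    ("C1", "Create Order", "2024-04-01 09:00"),
    ("C1", "Approve Order", "2024-04-01 10:30"),
    ("C1", "Ship Goods", "2024-04-02 08:00"),
    ("C2", "Create Order", "2024-04-01 09:15"),
    ("C2", "Ship Goods", "2024-04-03 14:00"),
    ("C3", "Create Order", "2024-04-02 11:00"),
    ("C3", "Approve Order", "2024-04-02 12:00"),
    ("C3", "Ship Goods", "2024-04-04 16:00"),
]


def discover_dfg(event_log):
    """Count how often one activity directly follows another within a case."""
    traces = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
        traces[case_id].append((moment, activity))

    dfg = Counter()
    for events in traces.values():
        events.sort()  # order the events of each case by timestamp
        activities = [activity for _, activity in events]
        for source, target in zip(activities, activities[1:]):
            dfg[(source, target)] += 1
    return dfg


if __name__ == "__main__":
    for (source, target), count in sorted(discover_dfg(EVENT_LOG).items()):
        print(f"{source} -> {target}: {count}")
```

The edge counts are what a process map typically draws as arrow weights: frequent edges form the main path, and rare edges point to exceptions.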
3.2.2 Conformance:
Conformance assists in analysing the differences between an event log and an existing
process model. A discovery algorithm or human construction can both be used to create
such a process model. Conformance checking can be used to evaluate the discovery
techniques, detect deviations, or improve an existing process model. For each option,
the event log is examined to determine whatever data is generally accessible at the time
the decision is taken. Then, traditional data mining techniques are employed to
determine which data items have an impact on the decision. A decision tree is thus
created for each option in the procedure.
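The deviation detection described in this section can be sketched in Python as follows. The reference model is reduced here to a set of allowed directly-follows steps, which is a simplifying assumption; established conformance checking techniques such as token replay or alignments are more elaborate. All names and data are hypothetical.

```python
# Hypothetical reference model: the allowed "directly follows" steps.
ALLOWED_STEPS = {
    ("START", "Create Order"),
    ("Create Order", "Approve Order"),
    ("Approve Order", "Ship Goods"),
    ("Ship Goods", "END"),
}

# Hypothetical traces: activities per case, already ordered by timestamp.
TRACES = {
    "C1": ["Create Order", "Approve Order", "Ship Goods"],
    "C2": ["Create Order", "Ship Goods"],
    "C3": ["Create Order", "Approve Order", "Approve Order", "Ship Goods"],
}


def find_deviations(trace, allowed_steps):
    """Return the steps of a trace that the reference model does not allow."""
    path = ["START", *trace, "END"]
    return [step for step in zip(path, path[1:]) if step not in allowed_steps]


def check_conformance(traces, allowed_steps):
    """Map each case id to its deviating steps (empty list if it conforms)."""
    return {
        case_id: find_deviations(trace, allowed_steps)
        for case_id, trace in traces.items()
    }


if __name__ == "__main__":
    for case_id, deviations in check_conformance(TRACES, ALLOWED_STEPS).items():
        status = "conforms" if not deviations else f"deviates at {deviations}"
        print(case_id, status)
```

In this example, case C2 skips the approval and case C3 repeats it, so both are reported as deviations while C1 conforms.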
3.2.3 Enhancement:
Process enhancement is the strategy that uses the insights from conformance checking
to propel continuous process improvement initiatives, driving efficiency gains and
customer satisfaction. This type of process mining has also been referred to as
extension, organizational mining, or performance mining. In this class of process
mining, additional information is used to improve an existing process model. For
example, the output of conformance checking can assist in identifying bottlenecks
within a process model, allowing managers to optimize an existing process.
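One simple way to locate a bottleneck is to measure the average waiting time between consecutive activities. The following Python sketch assumes a small hypothetical event log and treats the step with the longest average wait as the bottleneck; this is an illustration of the idea, not the method used by any particular tool.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, timestamp) per event.
EVENT_LOG = [
    ("C1", "Create Order", "2024-04-01 09:00"),
    ("C1", "Approve Order", "2024-04-01 11:00"),
    ("C1", "Ship Goods", "2024-04-02 11:00"),
    ("C2", "Create Order", "2024-04-01 10:00"),
    ("C2", "Approve Order", "2024-04-01 14:00"),
    ("C2", "Ship Goods", "2024-04-03 14:00"),
]


def average_waiting_hours(event_log):
    """Average time in hours between each pair of consecutive activities."""
    traces = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
        traces[case_id].append((moment, activity))

    durations = defaultdict(list)
    for events in traces.values():
        events.sort()  # order the events of each case by timestamp
        for (start, source), (end, target) in zip(events, events[1:]):
            durations[(source, target)].append((end - start).total_seconds() / 3600)
    return {step: sum(hours) / len(hours) for step, hours in durations.items()}


def find_bottleneck(event_log):
    """Return the step with the longest average waiting time."""
    averages = average_waiting_hours(event_log)
    return max(averages, key=averages.get)


if __name__ == "__main__":
    print(average_waiting_hours(EVENT_LOG))
    print("Bottleneck:", find_bottleneck(EVENT_LOG))
```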
To earn the Celonis Process Mining Fundamentals certificate, we have to complete
3 levels in the training track, including:
Level 1: Introduction to Process Mining
Level 2: Process Mining Fundamentals
This training track gives us the idea of what process mining is and the basics
of how it works. As per the theoretical foundations of Celonis, Process Mining is the
combination of two disciplines: Data Science and Business Process Management.
Process Mining essentially uses Data Science techniques, such as Big Data and AI, to
address Process Science problems such as process improvement and automation.
Process mining accomplishes this union by reconstructing and visualizing process flows
using the digital footprints left behind by IT systems. From this point, the technology
behind process mining can spot trends and deviations and eventually get rid of
bottlenecks. We will now look more closely at what is needed to reconstruct a process
in this manner.
A digital footprint is the trail of data you leave behind when using the internet. It
includes websites you visit, emails you send, and information you submit online. A
digital footprint can be used to track a person’s online activities and devices. Internet
users create their digital footprint either actively or passively.
These digital footprints must be retrieved and structured before they can be used
in process mining. This is where event logs play a key role. Event logs are the
format in which we can retrieve our digital footprints from the underlying IT systems.
They are essentially the log books that IT systems keep to record which events take place
for each Case ID and at what time.
This training track provides you with insights into both the theoretical and
applied foundations around Process Mining. The track is structured into three
milestones which consist of multiple courses. Those milestones are:
2. Build Analyses
The track contains a good selection of academic reading, software training, and
application examples of where the software is used in real life. It is also accompanied
by quizzes and software exercises to test your knowledge and skills in the software.
Department of Electrical and Electronics Engineering Page 9
Process mining virtual internship
The Review and Interpret Analyses training track is designed for data
and business analysts, process experts, and process improvement specialists. Keep
in mind, this track is focused mainly on product know-how and less so on business
acumen. If you'd like to complement your own experience in strategically identifying
and prioritizing process inefficiencies, and planning for and implementing
improvement measures, then we recommend you take a look at the Deliver Business
Value with Celonis training track after completing this one.
Here's a sneak peek of what you'll experience in the Review and Interpret Analyses
training track.
• Add the Variant and Process Explorers to make the process and all its activities
transparent.
• Describe the relationship between data tables and dimensions and KPIs.
• Configure tables and charts with dimensions and KPIs so that users can drill down
into the analysis.
• Configure single KPIs such as the number with a KPI to give users quick snapshots
of the health of the process.
• Add Dropdowns and configure them with dimensions, and add Date Pickers to
allow users to restrict the analysis to their desired time period.
• Use the Visual Editor to customize Standard Process KPIs and even build KPIs
from scratch using the visual formula builder.
• Add the Conformance checker sheet to the analysis and even add custom KPIs to
it.
• Create background filters (layers) at component, sheet, and Analysis-level that the
end user cannot remove.
• Add a dropdown to an OLAP table so that the end user can select the Dimension
to display from a list.
In the case study part, we are looking at the digitization journey of the Pizzeria Mamma
Mia from the perspective of Giovanni, the owner of the business, and Martin, his Junior
Manager. The Order-to-Cash process (O2C), which is the foundational procedure of the
Munich-based company, will be the subject of attention. The journey begins with the
digitalization of all process steps, continues with the identification of inefficiencies and
bottlenecks, and concludes with recommendations for both immediate improvements
and for maintaining the success of the firm.
quality of the process. High-level key performance measures like sales figures, costs,
and customer happiness are then improved as a result of these.
The PQL is a domain-specific language tailored towards a special process data model
and designed for business users. PQL enables the user to translate process-related
business questions into queries, which are then executed by a custom-built query
engine. PQL covers a broad set of operators, ranging from process-specific functions to
aggregations and mathematical operators. Its syntax is inspired by SQL, but specialized
for process-related queries. Even though Celonis PQL is inspired by SQL, there are
major differences between the two query languages.
• Celonis PQL does not support all operators that are available in SQL.
• Celonis PQL does not support a data manipulation language (DML).
• Celonis PQL does not provide any data definition language (DDL).
• In contrast to SQL, Celonis PQL is domain-specific and offers a wide range of
Process Mining operators not available in SQL.
Fig 3.5 Celonis PQL Engine
The Set Up a Data Pipeline part is further divided into sub-parts:
2. Connect to Systems
3. Extract Data
4. Transform Data
The Refine Your Data Pipeline part is likewise divided into sub-parts.
Chapter 4
Process Analytics by Celonis EMS provides you with insights into how your process
actually runs. With Process Discovery, process visualizations show you what’s happening in your
processes, so you can quickly identify opportunities to increase revenue, free capital,
and ensure customer satisfaction.
4.1 Studio:
The Studio allows you to combine functional expertise with the power of Celonis
Process Mining to create scalable Apps. Celonis Studio is your one-stop development
platform to build, test, and edit Execution Apps and Instruments in a single, low-code
interface.
Your business users can access your apps by clicking on the left-hand navigation menu
and selecting "Apps".
The Studio is structured by your Apps, also called Packages. Packages are collections
of Views, Knowledge Models, Skills and Analyses.
Views are the key user experience component for new Apps built on Celonis. They
empower your business users with a unified interface to consume data, create insights
and act on knowledge.
A View is a collection of components and tools to provide business users with focused
access to our business context and engines.
Knowledge Models
Knowledge Models act like a centralized place to consistently and reliably define
metrics through configuring Business Knowledge Entities. Business Knowledge
Entities are concrete definitions of KPIs, Benchmarks, Variables, Filters, and many
more. Standardizing the development and definition of these commonly used Business
Knowledge Entities solves a crucial scalability problem, ensures consistency across the
enterprise, and accelerates optimization efforts through the reuse of commonly used
Business Knowledge Entities.
Skills
A Skill is an integral part of each automation, as it defines its procedure by a sequence
of events. It consists of Sensors and Actions.
Analysis
An Analysis helps to identify execution gaps in your process. Within a Package, you
can create Analyses as you might know from Process Analytics. Currently, there are
slight differences between the analysis service offered in Process Analytics and the
analysis service that is offered in the Package. Creating analyses in the studio has the
benefit of versioning and better integration with other services offered by Celonis. It is
now possible to keep multiple analyses with different data models next to each other.
To create an analysis in a package, you need to hover on the package name, click on the
“+” sign, and select the “Analysis” option. You can then define the name of the analysis
and the data model variable to assign the analysis to a data model. The Analysis Key is
used to link other assets such as views to this Analysis.
Folders
Folders can be used inside packages for organizational purposes. Inside a package, you
can create as many folders as you want, and you can directly create new assets inside
the folders or move existing assets inside folders with drag and drop. Furthermore, you
can create folders inside folders; however, you cannot have more than nine levels of
folder hierarchy. This limit exists for usability purposes. Also, it is not possible to set
permissions on folders; for this capability, see Studio Spaces or Studio Packages.
Creating a package
Please note that the package name can be changed later, however, the package key as a
unique identifier cannot be changed after being defined.
Also, the package type cannot be changed after creating the package. You can find
information on your package by clicking on the three-dot menu next to the operational
app and then selecting Settings.
Process Analytics is a feature of the Celonis Execution Management System (EMS) that
provides insights into how your process actually runs. With Process Discovery, process
visualizations show you what’s happening in your processes, so you can quickly identify
opportunities to increase revenue, free capital, and ensure customer satisfaction.
To check, look for the Process Analytics icon in the navigation bar.
The Process Explorer is an analysis tool to use when taking an exploratory approach. It
starts with showing the most frequent activities and connections. You can add further
activities to the graph and analyze their impact on the process. Any number
of Event Logs can be visualized together to understand the relationships between
multiple processes.
Similar to the Variant Explorer, the Process Explorer allows you to filter on activities
and relationships. Although the Activity and Connection panels are not a part of Process
Explorer, activities and connections can still be used to filter on cases. The experience
of filtering will be comparable to that of the Variant Explorer.
There are two main configuration options for the process explorer:
• Basic Configuration
• Advanced Configurations
To configure a basic process explorer, you must define the event logs you wish to
visualize in the component. Ensure that, prior to creating a process explorer component,
you have the event logs defined in the eventLogsMetadata section of the Knowledge
Model.
With the aid of the Celonis EMS Analysis tool Variant Explorer, you can
investigate the movement of a particular process inside your company. Each process
variant would represent a potential path if we imagined a process to be a road trip. Each
step in a process may be compared to a waypoint on a route, and the connections
between steps could be compared to the roads that link the stops. Each journey a person
takes along a specific path would also constitute a case.
Using Variant Explorer, you can see the individual activities within each process
variant and the frequency of each variant. You can also compare variants to each other
and see metrics for individual variants, such as Activity Frequency.
In short, Variant Explorer gives you a quick way to see whether most process cases
follow an acceptable flow of activities or not and helps you develop your first analysis
questions.
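The idea of process variants and their frequencies can be sketched in a few lines of Python. The traces below are hypothetical, and the function names are chosen for illustration only; the sketch shows the grouping logic, not how Variant Explorer is implemented.

```python
from collections import Counter

# Hypothetical traces: ordered activities per case.
TRACES = {
    "C1": ["Create Order", "Approve Order", "Ship Goods"],
    "C2": ["Create Order", "Ship Goods"],
    "C3": ["Create Order", "Approve Order", "Ship Goods"],
    "C4": ["Create Order", "Approve Order", "Ship Goods"],
}


def count_variants(traces):
    """Group cases by their exact activity sequence and count each variant."""
    return Counter(tuple(trace) for trace in traces.values())


def variant_shares(traces):
    """Return (variant, case count, share of all cases), most frequent first."""
    variants = count_variants(traces)
    total = sum(variants.values())
    return [
        (variant, count, count / total)
        for variant, count in variants.most_common()
    ]


if __name__ == "__main__":
    for variant, count, share in variant_shares(TRACES):
        print(f"{' -> '.join(variant)}: {count} cases ({share:.0%})")
```

In the road trip picture used above, each distinct tuple of activities is one route, and the count is the number of journeys taken along it.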
The Case Explorer is one of the default analysis screens of Celonis Execution
Management System (Celonis EMS). The Case Explorer displays the data tables
connected to the Celonis engine and is an intuitive tool for examining cases and their
respective activities.
The data provided by the data sets are organized by the Case Explorer and presented as
a table.
1. Column sort: each column represents one of the imported columns from
the data set. When clicking in a column, it is possible to sort the entire
table according to the column content in ascending or descending order.
3. Case selector: each row of the Case Explorer table refers to a specific
case. The case details panel opens when clicking on a row.
4. Activity table divisor: line determining the division between the core
elements of the activity table (case ID, activity, and timestamp) located to
the left and the accessory elements from the activity table or other linked
tables, to the right.
The Table Columns tab allows you to select which columns will be displayed on the
Case Explorer table.
3. Column selector: each name on the list refers to a column on the case
explorer table. If one or more columns are selected, the Case Explorer
table will display only these together with the core activity table
columns (case ID, activity, and timestamp).
4. Reset: resets the column selection and reverts to the default view.
Operators usually create and return a single column that is either added to an existing
table (e.g. the case or activity table) or to a new, temporary result table. Only a few
operators (e.g. for computing a process graph) create and return one or more tables with
multiple columns. However, these operators are only used internally by GUI
components and are not exposed to the end-user. Currently, Celonis PQL provides more
than 150 different operators to process the event data. Due to space limitations, we
cannot sketch the full language. However, we can offer a brief overview of the major
Even though Celonis PQL is inspired by SQL, there are major differences between the
two query languages. Figure 6 shows these differences by comparing how to query the
cases and the number of involved departments for all orders with a value of more than
1000 euros in both languages. Furthermore, it also illustrates the key concepts of
Celonis PQL.
In contrast to SQL, Celonis PQL does not require the user to define how to join
the different tables within the query. Instead, it implicitly joins the tables according to
their foreign key relationships which have to be defined only once in the data model.
Also, the grouping clause is not needed in
Celonis PQL, as each selected column which is not aggregated (i.e. a dimension) is
implicitly used as a grouper. According to the design goals, implicit joins and groupings
significantly reduce the size and complexity of the queries and make it much simpler to
formulate them.
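To make the SQL side of this comparison concrete, the sketch below runs the example question (cases and number of involved departments for orders above 1000 euros) with Python's built-in sqlite3 module. The table names, column names, and data are assumptions made for illustration. The comments mark the explicit join and grouping that, according to the text above, Celonis PQL derives implicitly from the data model.

```python
import sqlite3


def departments_per_large_order(min_value=1000):
    """Run the SQL form of the query on a small in-memory example database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (case_id TEXT PRIMARY KEY, value REAL);
        CREATE TABLE activities (case_id TEXT, activity TEXT, department TEXT);
        INSERT INTO orders VALUES ('C1', 1500), ('C2', 800), ('C3', 2500);
        INSERT INTO activities VALUES
            ('C1', 'Create Order', 'Sales'),
            ('C1', 'Approve Order', 'Finance'),
            ('C1', 'Ship Goods', 'Logistics'),
            ('C2', 'Create Order', 'Sales'),
            ('C3', 'Create Order', 'Sales'),
            ('C3', 'Ship Goods', 'Sales');
        """
    )
    rows = conn.execute(
        """
        SELECT o.case_id, COUNT(DISTINCT a.department)
        FROM orders AS o
        JOIN activities AS a ON a.case_id = o.case_id  -- explicit join
        WHERE o.value > ?                              -- filter condition
        GROUP BY o.case_id                             -- explicit grouping
        ORDER BY o.case_id
        """,
        (min_value,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for case_id, departments in departments_per_large_order():
        print(case_id, departments)
```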
Both languages offer the possibility to filter rows. While SQL requires the user
to formulate the filter condition in the WHERE clause of the query, Celonis PQL offers
the FILTER statements which are separated from the TABLE statements but executed
together. Splitting the data selection and the filters into different statements enables the
user to define multiple filter statements in different locations inside an application,
which then can be combined into the table statement to query the data. Beyond this
simple structure, Celonis PQL provides a wide range of different operators which can
be combined to answer complex business questions. The following list gives an
overview of the most important classes of operators.
Aggregations.
Celonis PQL offers a wide range of aggregation functions, from simple standard
functions like count and average, to more advanced aggregations like standard deviation
and quantiles. Most of the aggregation functions are also available as window-based
functions computing the aggregation not over all values but over a user-defined sliding
element window.
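The difference between an aggregation over all values and a window-based aggregation can be illustrated in plain Python. This is a sketch of the concept only and does not reproduce PQL syntax; the function name and the choice of a trailing window are assumptions.

```python
def moving_average(values, window_size):
    """Average over a sliding window of the current and preceding values."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    result = []
    for index in range(len(values)):
        # The window is shorter at the start, where fewer values precede.
        window = values[max(0, index - window_size + 1): index + 1]
        result.append(sum(window) / len(window))
    return result


if __name__ == "__main__":
    durations = [2, 4, 6, 8]
    print("Overall average:", sum(durations) / len(durations))
    print("Window of 2:", moving_average(durations, 2))
```

The overall average collapses the column into one number, while the window-based version returns one value per row.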
Data functions.
These are operators like REMAP_VALUES and CASE WHEN which allow for
conditional changes of values.
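The kind of conditional value change described here can be sketched in Python. The two functions below only loosely mirror the idea behind value remapping and CASE WHEN rules; their names, parameters, and thresholds are hypothetical and do not reproduce the exact behaviour of the PQL operators.

```python
def remap_values(values, mapping, default=None):
    """Replace values listed in mapping; keep others, or use default if given."""
    remapped = []
    for value in values:
        if value in mapping:
            remapped.append(mapping[value])
        elif default is not None:
            remapped.append(default)
        else:
            remapped.append(value)
    return remapped


def order_size(value):
    """CASE WHEN-style rule: the first matching condition decides the result."""
    if value > 1000:
        return "large"
    if value > 100:
        return "medium"
    return "small"


if __name__ == "__main__":
    print(remap_values(["DE", "US", "FR"], {"DE": "Germany", "US": "United States"}))
    print([order_size(value) for value in (50, 500, 5000)])
```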
These functions enable the user to modify, project or round a date or time value, e.g.
add a day to a date or extract the month from a timestamp. There are also functions to
compute date and time difference (e.g. between timestamps of events).
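The three operations mentioned (adding a day, extracting the month, and computing a time difference) look as follows with Python's standard datetime module. This is an illustration of the operations themselves, not of PQL syntax; the helper names are assumptions.

```python
from datetime import datetime, timedelta


def add_days(timestamp, days):
    """Shift a timestamp by a number of days."""
    return timestamp + timedelta(days=days)


def extract_month(timestamp):
    """Project a timestamp onto its month (1 to 12)."""
    return timestamp.month


def hours_between(start, end):
    """Time difference between two event timestamps, in hours."""
    return (end - start).total_seconds() / 3600


if __name__ == "__main__":
    created = datetime(2024, 4, 30, 9, 0)
    shipped = datetime(2024, 5, 1, 15, 0)
    print(add_days(created, 1))
    print(extract_month(created))
    print(hours_between(created, shipped))
```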
Index functions.
There are various machine learning functions available, e.g., to cluster data using the
kmeans algorithm or learn decision trees.
Math functions.
Celonis PQL offers a wide range of mathematical functions, e.g., for arithmetic
computations, rounding float numbers, and computing logarithms.
Chapter 5
Use cases of Process Mining
1. Process Discovery: Process mining helps visualize and understand real-world
processes based on event logs. This aids in identifying bottlenecks,
inefficiencies, and variations in the processes.
4. Root Cause Analysis: Process mining enables the identification of root causes
behind bottlenecks, delays, or errors within processes. This information is
crucial for making targeted improvements.
Chapter 6
Applications of Process Mining
Process mining has a wide range of use cases across various industries and
sectors, showcasing its versatility in uncovering insights and optimizing business
processes. Here are some notable use cases of process mining:
2. Order-to-Cash Analysis:
Process mining can provide end-to-end visibility into supply chain processes,
helping organizations identify delays, stockouts, and inefficiencies in procurement,
production, and distribution. Enhanced supply chain management can lead to reduced
costs, improved inventory control, and optimized resource allocation.
Process mining can map the customer journey across various touchpoints,
helping organizations understand customer interactions and behaviors. Insights can
drive personalized marketing strategies and improvements in customer service.
7. IT Incident Management:
These use cases highlight the diverse ways in which process mining can be applied to
uncover insights, optimize processes, and drive operational excellence across industries.
Chapter 7
Learning Outcomes
After completing this course, you ought to be able to:
Conclusion
In conclusion, process mining is a strong and adaptable technique that delivers useful
insights into the inner workings of organizational processes. Process mining uncovers
hidden patterns, exposes inefficiencies, and offers practical suggestions for process
optimization by examining event data produced during the execution of processes. A
number of different sectors, including manufacturing, healthcare, banking, logistics,
and customer service, could benefit from this technology.
Process mining techniques are anticipated to become even more advanced and
integrated with other data-driven methodologies as technology develops, thereby
enhancing their capacity to promote process excellence. However, a thorough grasp of
both the technology and the underlying business processes is necessary for the
successful deployment of process mining. By utilizing the potential of data-driven
insights to continuously improve their operations and reach higher levels of efficiency
and effectiveness, businesses that adopt process mining stand to gain a competitive
edge.
Internship Certificate
References
1. Badakhshan, P., Geyer-Klingeberg, J., El-Halaby, M., Lutzeyer, T. and Affonseca,
G.V.L., 2020, September. Celonis Process Repository: A Bridge between Business
Process Management and Process Mining. In BPM (PhD/Demos) (pp. 67-71).
2. Vogelgesang, T., Ambrosy, J., Becher, D., Seilbeck, R., Geyer-Klingeberg, J. and
Klenk, M., 2021. Celonis PQL: A query language for process mining. In Process
Querying Methods (pp. 377-408). Cham: Springer International Publishing.
3. Van Der Aalst, W., 2012. Process mining: Overview and opportunities. ACM
Transactions on Management Information Systems (TMIS), 3(2), pp.1-17.
4. Reinkemeyer, L., 2020. Process mining in action. Process Mining in Action:
Principles, Use Cases and Outlook.
5. Turner, C.J., Tiwari, A., Olaiya, R. and Xu, Y., 2012. Process mining: from theory
to practice. Business Process Management Journal, 18(3), pp.493-512.
6. https://en.wikipedia.org/wiki/Process_mining
7. https://docs.celonis.com/en/celonis-documentation.html