Chapter 3
Healthcare Data Analytics
HI and Telemedicine
29/10/2014 E.C BY: MATIWOS T.
7/18/2022
…
Information
• Meaningful data or facts from which conclusions can be drawn by humans or computers
• Provides the “who, what, when, where”
Knowledge
• Application of data and information
• Information that is justifiably considered to be true
• Provides the “how”
Wisdom
Critical use of knowledge to make intelligent decisions
Assimilates knowledge and experience and allows us to judge the right
course of action in a given situation
Provides the “why”
DIKW Pyramid
Wisdom
Knowledge
Information
• Provides meaning
• Objective in nature
Data
• Abundant
• Meaningless
Example 1
Problem: Patient B is allergic to penicillin. He was recently prescribed amoxicillin for his sore
throat.
Solution:
Data: Penicillin, amoxicillin, sore throat
Information:
Patient B has a penicillin allergy
Patient B was prescribed amoxicillin for his sore throat
Knowledge:
Patient B may have an allergic reaction to his prescription
Wisdom:
Patient B should not take amoxicillin!!!
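The step from information to knowledge to wisdom in Example 1 can be sketched in code. This is only an illustration: the `DRUG_CLASS` table and the `check_prescription` function are hypothetical names invented here, and the table is not a clinical reference.

```python
# Hypothetical lookup table (drug -> drug class), invented for illustration only.
DRUG_CLASS = {
    "amoxicillin": "penicillin",
    "penicillin": "penicillin",
    "ibuprofen": "nsaid",
}


def check_prescription(allergies, prescribed):
    """Return a warning if the prescribed drug is in a class the patient is allergic to."""
    # Knowledge: which class does the prescribed drug belong to?
    drug_class = DRUG_CLASS.get(prescribed.lower())
    # Wisdom: decide what to do when the class matches a recorded allergy.
    if drug_class in {a.lower() for a in allergies}:
        return f"Do not give {prescribed}: patient is allergic to {drug_class}"
    return None


# Information: what we know about Patient B.
patient_b = {"allergies": ["penicillin"], "prescribed": "amoxicillin"}
print(check_prescription(patient_b["allergies"], patient_b["prescribed"]))
```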
Outline
• Introduction
◦ Definition
◦ Types
◦ Steps
• The Analytics Pipeline
• Machine learning
◦ Types
◦ Steps
• Data
◦ Types
◦ Data inconsistency
◦ Data dictionaries
• Common terms used in statistical analytics
• Challenges of data analytics
Introduction
One of the promises of the growing critical mass of clinical data
accumulating in electronic health record (EHR) systems is secondary use
(or re-use) of the data for other purposes, such as
◦ Quality improvement and
◦ Clinical research
The analysis of this data is usually called analytics (or data analytics).
“Information is the oil of the 21st century, and analytics is the
combustion engine” Peter Sondergaard
What is analytics?
Types of Analytics
Three levels of analytics, each with increasing functionality and value (By: IBM):
• Descriptive: describes current situations and problems
◦ Uses business intelligence and data mining to ask “What has happened?”
• Predictive: simulation and modeling techniques that identify trends and foreshadow outcomes of actions taken
◦ Uses statistical models and forecasts to ask “What could happen?”
• Prescriptive: optimizes clinical, financial, and other outcomes
◦ Uses optimization and simulation to ask “What should we do?”
• Diagnostic: examines data to answer “Why did it happen?” (By: Gartner)
Descriptive analytics
Describe the data
Common statistics:
◦ Counts
◦ Averages
Typical reporting methods:
◦ Tables
◦ Pie charts
◦ Bar charts
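As a rough illustration of the counts and averages listed above, the sketch below summarizes a handful of records and prints them as a small table. The records and the `describe` function are invented for this example.

```python
from collections import Counter
from statistics import mean

# Hypothetical patient records, invented for illustration only.
patients = [
    {"gender": "F", "age": 34},
    {"gender": "M", "age": 51},
    {"gender": "F", "age": 47},
    {"gender": "M", "age": 60},
]


def describe(records):
    """Return (count of records per gender, average age)."""
    counts = Counter(r["gender"] for r in records)
    average_age = mean(r["age"] for r in records)
    return counts, average_age


counts, average_age = describe(patients)

# Report the counts as a simple two-column table.
print("Gender\tCount")
for gender, n in sorted(counts.items()):
    print(f"{gender}\t{n}")
print(f"Average age: {average_age}")
```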
Diagnostic analytics
• Attempts to answer “Why did it happen?”
• Tools used:
◦ Drill-down techniques
◦ Data discovery
◦ Correlations
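Correlation, one of the tools listed above, can be computed directly. The sketch below implements the Pearson correlation coefficient by hand; the two variables and their values are hypothetical. A strong correlation suggests where to look for a cause but does not by itself establish one.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical data: length of stay (days) and number of medications per patient.
length_of_stay = [2, 3, 5, 7, 9]
medications = [1, 2, 4, 5, 8]
print(f"r = {pearson(length_of_stay, medications):.2f}")
```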
Predictive analytics
Predicts outcomes rather than describing or classifying
Rapid analysis
Relevant insight
Attempts to optimize health and financial outcomes
Prescriptive analytics
Examines data or content to answer the question “What should be
done?” or “What can we do to make something happen?”
Is characterized by techniques such as
◦ Graph analysis
◦ Simulation
◦ Neural networks
◦ Machine learning
• What data elements, such as date of birth, gender, medication, laboratory results, etc., are needed?
• Where are these data elements located?
• Is there a clinical data warehouse?
• Who is the contact person?
Retrieval
• Method for cross-checking the number of records as well as completeness: how many should we expect, and did we get everything?
1. The data are retrieved
2. Checked for completeness
3. Errors corrected, empty fields addressed
4. Data synchronized (“transformed”), e.g., M, F, U vs. 1, 2, 3
5. Data imported to the destination system
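The retrieve, check, correct, transform, and import flow above can be sketched as a small function. Everything named here is an assumption made for illustration: the field names, the `run_etl` and `transform` helpers, and in particular the pairing of codes 1, 2, 3 with M, F, U, which the slide does not specify.

```python
# Assumed code mapping; the source only says "M, F, U vs 1, 2, 3".
GENDER_MAP = {"1": "M", "2": "F", "3": "U"}
# Hypothetical required fields for the destination system.
REQUIRED_FIELDS = ("patient_id", "gender", "date_of_birth")


def transform(record):
    """Return a cleaned copy of the record, or None if a required field is empty."""
    if any(not record.get(field) for field in REQUIRED_FIELDS):
        return None
    cleaned = dict(record)
    # Synchronize gender codes; values already in M/F/U form pass through unchanged.
    cleaned["gender"] = GENDER_MAP.get(str(record["gender"]), record["gender"])
    return cleaned


def run_etl(source_rows, expected_count):
    """Check the record count, then split rows into loadable and rejected."""
    if len(source_rows) != expected_count:
        raise ValueError(f"expected {expected_count} records, got {len(source_rows)}")
    loaded, rejected = [], []
    for row in source_rows:
        cleaned = transform(row)
        if cleaned is None:
            rejected.append(row)  # empty fields still need to be addressed
        else:
            loaded.append(cleaned)  # ready to import into the destination system
    return loaded, rejected


rows = [
    {"patient_id": "p1", "gender": "1", "date_of_birth": "1990-01-01"},
    {"patient_id": "p2", "gender": "", "date_of_birth": "1985-05-05"},
    {"patient_id": "p3", "gender": "F", "date_of_birth": "1978-09-12"},
]
loaded, rejected = run_etl(rows, expected_count=3)
print(len(loaded), "loaded,", len(rejected), "rejected")
```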
…
The pipeline begins with input data sources, which in healthcare and
biomedicine may include
◦ Clinical records
◦ Financial records
◦ Genomics and related data
◦ Other data, even those from outside the healthcare setting (e.g., census data)
…
The next step is feature extraction, where various computational
techniques are used to organize and extract elements of the data, such as
◦ Linking records across sources
◦ Using natural language processing (NLP) to extract and normalize concepts
◦ Matching other patterns
…
This is followed by statistical processing, where machine learning and
related statistical inference techniques are used to draw conclusions from
the data.
The final step is the output of predictions, often with probabilistic
measures of confidence in the results.
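The four pipeline stages described above can be illustrated with a toy example. This is a deliberately simplified sketch: the notes and outcomes are invented, a regular expression stands in for real NLP, and a simple observed proportion stands in for a machine learning model.

```python
import re

# Stage 1: input data sources (hypothetical clinical notes and admission outcomes).
notes = [
    ("p1", "Patient reports chest pain and nausea."),
    ("p2", "Cough and fever for three days."),
    ("p3", "Chest pain on exertion, resolved at rest."),
    ("p4", "Sudden chest pain radiating to the left arm."),
]
admitted = {"p1": True, "p2": False, "p3": False, "p4": True}


def mentions_chest_pain(text):
    """Stage 2: feature extraction, using pattern matching as a stand-in for NLP."""
    return bool(re.search(r"\bchest pain\b", text, re.IGNORECASE))


def admission_probability(notes, outcomes):
    """Stages 3 and 4: estimate P(admitted | chest pain) as an observed proportion."""
    # Link records across the two sources by patient identifier.
    flagged = [pid for pid, text in notes if mentions_chest_pain(text)]
    if not flagged:
        return None
    return sum(outcomes[pid] for pid in flagged) / len(flagged)


probability = admission_probability(notes, admitted)
print(f"Estimated probability of admission given chest pain: {probability:.2f}")
```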
1. Collect data
2. Prepare input data (clean)
3. Analyze the data
4. (If supervised) train algorithm
5. Test algorithm
6. Use algorithm
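The machine learning steps above can be walked through with a very small supervised example. This is a sketch only: the measurements and labels are invented, and the "algorithm" is a single threshold placed midway between the two class averages, chosen for brevity rather than realism.

```python
from statistics import mean

# Step 1: collect data. Hypothetical (measurement, label) pairs; None marks a missing value.
raw = [(118, 0), (125, 0), (None, 0), (150, 1), (162, 1),
       (135, 0), (158, 1), (122, 0), (170, 1), (145, 1)]

# Step 2: prepare (clean) the input data by dropping rows with missing measurements.
clean = [(x, y) for x, y in raw if x is not None]

# Step 3: analyze the data, here by comparing the average measurement per label.
mean_by_label = {label: mean(x for x, y in clean if y == label) for label in (0, 1)}

# Hold out the last three rows for testing; train on the rest.
train_rows, test_rows = clean[:-3], clean[-3:]


def train(rows):
    """Step 4: learn a threshold midway between the two class averages."""
    mean_0 = mean(x for x, y in rows if y == 0)
    mean_1 = mean(x for x, y in rows if y == 1)
    return (mean_0 + mean_1) / 2


def predict(threshold, x):
    """Step 6: use the trained threshold on a new measurement."""
    return int(x > threshold)


def accuracy(threshold, rows):
    """Step 5: fraction of held-out rows classified correctly."""
    return sum(predict(threshold, x) == y for x, y in rows) / len(rows)


threshold = train(train_rows)
print("Held-out accuracy:", accuracy(threshold, test_rows))
print("Prediction for 180:", predict(threshold, 180))
```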
Data
In all healthcare organizations, clinical data takes a variety of forms, from
◦ Structured (e.g., images, lab results, etc.) to
◦ Unstructured (e.g., textual notes including clinical narratives, reports,
and other types of documents).
Data inconsistencies
Inconsistent naming conventions, such as “systolic blood pressure” vs.
“blood pressure, systolic”
Inconsistent definitions, such as how the date of admission is defined
across departments
Varying field lengths for the same data element, such as allowing 50 characters
for patient last name in one system and 25 in another
Varied data element values, such as M, F, or U for patient gender in one system
and 1, 2, or 9 in another
Data dictionaries
1st step: obtain the data dictionary to understand your data
…
A standard definition of data elements, which creates transparency
Enables analysts to report consistently and accurately
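One way to picture a data dictionary is as a table of rules that every record can be checked against. The sketch below is an assumption-laden illustration: the field names, rules, and the `validate` function are hypothetical and not taken from any particular system.

```python
# Hypothetical data dictionary: one entry per data element.
DATA_DICTIONARY = {
    "last_name": {"type": str, "max_length": 50},
    "gender": {"type": str, "allowed": {"M", "F", "U"}},
    "systolic_bp": {"type": int},
}


def validate(record, dictionary=DATA_DICTIONARY):
    """Return a list of problems found when checking a record against the dictionary."""
    problems = []
    for field, rule in dictionary.items():
        if field not in record:
            problems.append(f"{field}: missing")
            continue
        value = record[field]
        if not isinstance(value, rule["type"]):
            problems.append(f"{field}: expected {rule['type'].__name__}")
            continue
        if "max_length" in rule and len(value) > rule["max_length"]:
            problems.append(f"{field}: longer than {rule['max_length']} characters")
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{field}: value {value!r} not allowed")
    return problems


# A record from a system that codes gender numerically fails the check.
print(validate({"last_name": "Abebe", "gender": "1", "systolic_bp": 120}))
```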
Population
A group of things that have something in common
Example: patients in a particular hospital, patients with a certain diagnosis,
patients who had a certain surgical procedure
Sample
A representative portion or subset of a group of things
Part of a population
Confidence intervals
• Indicate how well a sample approximates the entire population
• The confidence level is often set at 95%
Data set: a collection of data for a specific purpose, e.g., a collection of 500
records that consists of age, gender, state of residence…
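As a worked illustration of a 95% confidence interval, the sketch below estimates a population mean from a sample. It assumes a normal approximation, which is reasonable for large samples; for small samples a t-distribution is more appropriate. The sample values and the `confidence_interval` function are invented for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    sample_mean = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    # For level=0.95 this is the familiar z value of about 1.96.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error


# Hypothetical sample of systolic blood pressure readings from a larger population.
sample = [120, 130, 125, 135, 128, 122, 131, 127]
low, high = confidence_interval(sample)
print(f"Sample mean: {mean(sample)}")
print(f"95% confidence interval: {low:.1f} to {high:.1f}")
```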
Patient Identifiers
Benefits
◦ Easy linkage of record
◦ Facilitate health information exchange
◦ Reduce errors and costs arising from duplicate and overlaid records
Duplicate and overlaid records
Duplicate records: when a single individual has more than one
identifier
Overlaid records: when more than one individual shares the same
identifier
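The two definitions above translate into two simple checks. In this sketch, the record layout and the `find_duplicates_and_overlays` function are hypothetical, and a person is assumed to be identified by name plus date of birth, which is a simplification of real patient matching.

```python
from collections import defaultdict


def find_duplicates_and_overlays(records):
    """Group records both ways: identifiers per person and persons per identifier."""
    ids_by_person = defaultdict(set)
    persons_by_id = defaultdict(set)
    for record in records:
        # Assumption: name plus date of birth is enough to tell individuals apart.
        person = (record["name"].strip().lower(), record["date_of_birth"])
        ids_by_person[person].add(record["patient_id"])
        persons_by_id[record["patient_id"]].add(person)
    # Duplicate: one individual with more than one identifier.
    duplicates = {p: ids for p, ids in ids_by_person.items() if len(ids) > 1}
    # Overlay: one identifier shared by more than one individual.
    overlays = {i: ps for i, ps in persons_by_id.items() if len(ps) > 1}
    return duplicates, overlays


# Hypothetical records, invented for illustration only.
records = [
    {"patient_id": "A100", "name": "Sara Bekele", "date_of_birth": "1990-02-14"},
    {"patient_id": "A205", "name": "sara bekele", "date_of_birth": "1990-02-14"},
    {"patient_id": "B300", "name": "Dawit Alemu", "date_of_birth": "1982-07-01"},
    {"patient_id": "B300", "name": "Hana Tesfaye", "date_of_birth": "1975-11-23"},
]
duplicates, overlays = find_duplicates_and_overlays(records)
print("Duplicates:", duplicates)
print("Overlays:", overlays)
```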
Conclusion
Clearly, there is great promise ahead for healthcare, driven by data analytics.
The growing quantity of clinical and research data, along with methods to
analyze and put it to use, can lead to improved personal health, healthcare
delivery, and biomedical research.
However, there is also a continued need to improve the completeness and
quality of data, as well as to conduct research that demonstrates how best to apply it
to solve real-world problems.
In addition, human expertise, including in informatics, will be required to
optimally carry out such work.
Qui…
END
Thank you!