
Chapter 3

The document discusses healthcare data analytics including data, information, knowledge, and the DIKW pyramid. It outlines topics of healthcare data analytics including data types, analytics pipeline, machine learning types and steps, and challenges of data analytics.

Uploaded by

kaleabs321
Copyright
© © All Rights Reserved
7/18/2022

Chapter Three
Healthcare Data Analytics

29/10/2014 E.C BY: MATIWOS T. 1

Revision: Healthcare Data, Information and Knowledge


Data
Symbols, facts, measurements
Variety of data types
• Integers, floating-point numbers, characters, strings
Stored in a variety of file formats
• Image, text, audio, video

HI and Telemedicine


Information
• Meaningful data or facts from which conclusions can be drawn by humans or computers
• Provides the “who, what, when, where”

Knowledge
• Application of data and information
• Information that is justifiably considered to be true
• Provides the “how”


Wisdom
Critical use of knowledge to make intelligent decisions
Assimilates knowledge and experience and lets us judge the right course of action in a given situation
Provides the “why”


DIKW pyramid

• There is much more data than information, knowledge, or wisdom.
• As data are consumed and analyzed, the amount of knowledge and wisdom produced is much smaller.


DIKW Pyramid

Wisdom

Knowledge

Information
• Provides meaning
• Objective in nature
Data
• Abundant
• Meaningless

29/10/2014 E.C BY: MATIWOS T. Health Informatics and Telemedicine


Informatics Vs IT and Computer Science


Computer scientists develop algorithms to search or sort data, where what is being sorted or searched is largely irrelevant.
• I.e., the meaning of the data is of secondary importance
Information and knowledge, on the other hand, are addressed by
informatics.
To an informatician, computers are tools for manipulating information.


Converting Data to Information to Knowledge


Conceptual and computational model
The distinction between
• the real (represented) world,
• the conceptual model (the representing world), and
• the computational model (that which the computer manipulates)
is fundamental to informatics.
Transformation of information (meaningful data) into knowledge (justified, true belief) is a core goal of science.


Example 1
Problem: Patient B is allergic to penicillin. He was recently prescribed amoxicillin for his sore throat.
Solution:
Data: penicillin, amoxicillin, sore throat
Information:
• Patient B has a penicillin allergy
• Patient B was prescribed amoxicillin for his sore throat
Knowledge:
• Amoxicillin is a penicillin-class antibiotic, so Patient B may have an allergic reaction to his prescription
Wisdom:
• Patient B should not take amoxicillin!
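
The step from information to knowledge in Example 1 can be sketched in code. This is only an illustration: the `DRUG_CLASS` table and the `check_prescription` function are hypothetical names invented here, and a real system would rely on a maintained drug knowledge base rather than a hand-written dictionary.

```python
# Hypothetical drug-to-class lookup (an assumption for illustration only).
DRUG_CLASS = {
    "amoxicillin": "penicillin",
    "ampicillin": "penicillin",
    "ibuprofen": "nsaid",
}

def check_prescription(allergies, drug):
    """Return a warning if the drug or its class matches a recorded allergy, else None."""
    drug_class = DRUG_CLASS.get(drug.lower())
    allergy_set = {a.lower() for a in allergies}
    if drug.lower() in allergy_set or drug_class in allergy_set:
        return f"Possible allergic reaction: {drug} belongs to class '{drug_class}'"
    return None

# Information: what is recorded about Patient B
patient_b = {"allergies": ["penicillin"], "prescription": "amoxicillin"}

# Knowledge: applying the drug-class rule to that information
warning = check_prescription(patient_b["allergies"], patient_b["prescription"])
print(warning)
```

The decision not to give the drug (wisdom) stays with the clinician; the code only surfaces the knowledge.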


…Healthcare Data Analytics


Outline
• Introduction
◦ Definition
◦ Types
◦ Steps
• The Analytics Pipeline
• Machine learning
◦ Types
◦ Steps
• Data
◦ Types
◦ Data inconsistency
◦ Data dictionaries
• Common terms used in statistical analytics
• Challenges of data analytics


Introduction
• One of the promises of the growing critical mass of clinical data accumulating in electronic health record (EHR) systems is secondary use (or re-use) of the data for other purposes, such as
◦ Quality improvement and
◦ Clinical research
• The analysis of this data is usually called analytics (or data analytics)


“Information is the oil of the 21st century, and analytics is the combustion engine.” Peter Sondergaard


What is analytics?

The discovery of meaningful patterns in data
The entire process of data collection, extraction, transformation, analysis, interpretation and reporting
The analytics process is the synthesis of knowledge from information


Types of Analytics
Three levels of analytics, each with increasing functionality and value (by IBM):
• Descriptive: describes current situations and problems
◦ Uses business intelligence and data mining to ask “What has happened?”
• Predictive: simulation and modeling techniques that identify trends and foreshadow outcomes of actions taken
◦ Uses statistical models and forecasts to ask “What could happen?”
• Prescriptive: optimizing clinical, financial, and other outcomes
◦ Uses optimization and simulation to ask “What should we do?”
A fourth type (by Gartner):
• Diagnostic: examines data to answer “Why did it happen?”


Descriptive analytics
• Describe the data
• Common statistics:
◦ Counts
◦ Averages
• Typical reporting methods:
◦ Tables
◦ Pie charts
◦ Bar charts
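
As a rough sketch of descriptive analytics, the snippet below computes counts and an average and prints them as a small table with a text bar chart. The `visits` records are made-up values for illustration, not data from any real system.

```python
from collections import Counter
from statistics import mean

# Hypothetical visit records (made-up values for illustration only)
visits = [
    {"department": "Outpatient", "age": 34},
    {"department": "Emergency", "age": 61},
    {"department": "Outpatient", "age": 45},
    {"department": "Inpatient", "age": 70},
    {"department": "Emergency", "age": 29},
    {"department": "Outpatient", "age": 52},
]

# Counts: number of visits per department
counts = Counter(v["department"] for v in visits)
# Averages: mean patient age across all visits
average_age = mean(v["age"] for v in visits)

# Table with a simple text bar chart, most frequent department first
print(f"{'Department':<12}{'Visits':>6}")
for department, n in counts.most_common():
    print(f"{department:<12}{n:>6}  " + "#" * n)
print(f"Average age: {average_age:.1f}")
```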


Diagnostic analytics
• Attempts to answer “Why did it happen?”
• Tools used:
◦ Drill-down techniques
◦ Data discovery
◦ Correlations
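
One of the tools listed above, correlation, can be sketched with the Pearson coefficient. The `pearson_r` helper and the monthly staffing and falls figures are assumptions made up for this example. A strong correlation suggests where to drill down next; on its own it does not explain why something happened.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly figures for one ward (made-up values)
nurse_staffing = [10, 9, 8, 7, 6, 5]
falls = [2, 3, 3, 5, 6, 8]

r = pearson_r(nurse_staffing, falls)
print(f"r = {r:.2f}")  # a value near -1 means falls rise as staffing drops
```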


Predictive analytics
Predicts instead of describing or classifying
Rapid analysis
Relevant insight
Attempting to optimize health and financial outcomes


What predictive analytics can’t do


Prescriptive analytics
• Examines data or content to answer the question “What should be done?” or “What can we do to make something happen?”
• Is characterized by techniques such as
◦ Graph analysis
◦ Simulation
◦ Neural networks
◦ Machine learning


Steps in data analytics


1. Identify the problem and the stakeholders
2. Identify what data are needed and where those data are located
3. Develop a plan for analysis and retrieval
4. Extract/ Transform/ Load the data
5. Check, clean and prepare the data for analysis
6. Analyze and interpret the data
7. Visualize the data
8. Disseminate the new knowledge
9. Implement the knowledge into the organization


1. Identify the problem or question and the stakeholders


Why is this an important problem?
How will the results impact patient care or the institution?
Who are the stakeholders?


2. Identify what data are needed

What data elements, such as date of birth, gender, medication, laboratory results, etc., are needed?
Where are these data elements located?
Is there a clinical data warehouse?
Who is the contact person?


3. Develop plans for retrieval and analysis


Retrieval
• Develop a specific plan for retrieving the required data elements
• Method for cross-checking the number of records as well as completeness: how many should we expect, and did we get everything?

Analysis
• Identify the population, sample size, and statistical tests to be performed


4. Extract/Transform/Load (ETL) process

Extraction
• The data are retrieved
• Checked for completeness

Transformation
• Errors corrected, empty fields addressed
• Data synchronized (“transformed”), e.g. M, F, U vs 1, 2, 3

Loading
• Data imported to the destination system
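
A minimal sketch of the three ETL stages is shown below, using only the Python standard library. The column names, the expected row count, and in particular which numeric code stands for which gender letter are assumptions made for illustration; the real mapping has to come from the source system's documentation.

```python
import csv
import io

# Hypothetical source extract; column names and values are made up.
SOURCE_CSV = """patient_id,gender,dob
101,1,1980-02-11
102,2,
103,3,1975-07-30
"""
GENDER_MAP = {"1": "M", "2": "F", "3": "U"}  # assumed source-to-destination coding
EXPECTED_ROWS = 3

def extract(text):
    """Extraction: retrieve the rows and check that the expected number arrived."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if len(rows) != EXPECTED_ROWS:
        raise ValueError(f"expected {EXPECTED_ROWS} rows, got {len(rows)}")
    return rows

def transform(rows):
    """Transformation: recode gender and make empty fields explicit."""
    out = []
    for row in rows:
        out.append({
            "patient_id": row["patient_id"],
            "gender": GENDER_MAP.get(row["gender"], "U"),
            "dob": row["dob"] or None,  # empty string becomes an explicit missing value
        })
    return out

def load(rows, destination):
    """Loading: import the rows into the destination (here, a plain list)."""
    destination.extend(rows)
    return len(rows)

warehouse = []
loaded = load(transform(extract(SOURCE_CSV)), warehouse)
print(loaded, warehouse[1])
```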


5. Check, clean and prepare the data

Should be a complete set of data
Need to check that everything is ready for analysis
Double check the problem or question being investigated
Double check against the analysis plan


6. Analyze and interpret the data

Use the data analysis plan
Perform the actual statistical analysis as described in the plan
Consult with a statistician to confirm interpretations and conclusions


7. Visualize the data


• Clear and understandable data should be communicated to the decision makers
• Nominal (categorical) data: column or bar charts, tables, pie charts, pivot tables
• Quantitative data: histograms, scatter plots, star plots


8 & 9. Disseminating and Implementing

Disseminating the new knowledge
• Write up the findings
• Disseminate to the stakeholders

Implementing the new knowledge
• Requires participation of stakeholders


The Analytics Pipeline



• The pipeline begins with input data sources, which in healthcare and biomedicine may include
◦ Clinical records
◦ Financial records
◦ Genomics and related data
◦ Other data, even from outside the healthcare setting (e.g., census data)



• The next step is feature extraction, where various computational techniques are used to organize and extract elements of the data, such as
◦ Linking records across sources
◦ Using natural language processing (NLP) to extract and normalize concepts
◦ Matching of other patterns



• This is followed by statistical processing, where machine learning and related statistical inference techniques are used to draw conclusions from the data.
• The final step is the output of predictions, often with probabilistic measures of confidence in the results.
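
The four pipeline stages can be sketched end to end as follows. Everything here is an assumption for illustration: the records are made up, the keyword match is a crude stand-in for real NLP (it would not handle negation such as "no diabetes"), and the logistic coefficients in `WEIGHTS` and `INTERCEPT` are invented rather than fitted to data.

```python
import math
import re

# 1. Input data sources (hypothetical, made-up records keyed by patient id)
clinical = {
    "p1": "Known type 2 diabetes, on metformin.",
    "p2": "No chronic conditions.",
}
financial = {
    "p1": {"admissions_last_year": 3},
    "p2": {"admissions_last_year": 0},
}

# 2. Feature extraction: link records across sources and match simple text patterns
def extract_features(patient_id):
    note = clinical[patient_id]
    return {
        "has_diabetes": 1 if re.search(r"\bdiabetes\b", note, re.IGNORECASE) else 0,
        "admissions": financial[patient_id]["admissions_last_year"],
    }

# 3. Statistical processing: a logistic model with assumed (not fitted) coefficients
WEIGHTS = {"has_diabetes": 1.2, "admissions": 0.6}
INTERCEPT = -2.0

def predict_probability(features):
    z = INTERCEPT + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# 4. Output: a prediction with a probabilistic measure attached
for pid in clinical:
    p = predict_probability(extract_features(pid))
    print(f"{pid}: readmission probability {p:.2f}")
```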


Methodology of data analytics


Machine learning: a core methodology that aims to build systems and algorithms that learn from data
Data mining: a major family of machine learning techniques that process and model large amounts of data to discover previously unknown patterns or relationships
• Text mining: a subarea of data mining which applies data mining techniques to mostly unstructured textual data


Major types of learning


Supervised: learn to predict a known output
• Learns from training data and is evaluated on separate test data to guard against “overfitting”
• E.g. predicting a diagnosis from ECG interpretation, detecting abnormalities on chest x-rays, predicting the risk of coronary heart disease
Unsupervised: find naturally occurring patterns or groupings within data
• E.g. discovery of new attributes associated with the diagnosis, treatment or prognosis of diseases
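
To show why a supervised model is evaluated on held-out test data, here is a small sketch with a 1-nearest-neighbour classifier, chosen because it memorizes its training set. The synthetic patients, their distributions, and the 75/25 split are all assumptions for illustration. The model scores perfectly on the data it has seen, so only the test score says anything about new patients.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def make_patient(at_risk):
    """Synthetic (age, systolic BP) pair; the distributions are invented for illustration."""
    if at_risk:
        return (rng.gauss(62, 10), rng.gauss(150, 15)), 1
    return (rng.gauss(48, 10), rng.gauss(128, 15)), 0

data = [make_patient(i % 2 == 0) for i in range(200)]
rng.shuffle(data)
train, test = data[:150], data[150:]  # hold out 25% that the model never sees

def predict_1nn(x, examples):
    """Label of the closest training example (a model that memorizes its training data)."""
    nearest = min(examples, key=lambda ex: (ex[0][0] - x[0]) ** 2 + (ex[0][1] - x[1]) ** 2)
    return nearest[1]

def accuracy(dataset, examples):
    """Fraction of the dataset whose label is predicted correctly from the examples."""
    hits = sum(predict_1nn(x, examples) == y for x, y in dataset)
    return hits / len(dataset)

train_accuracy = accuracy(train, train)  # each training point is its own nearest neighbour
test_accuracy = accuracy(test, train)    # the honest estimate of performance on unseen data
print(f"train accuracy {train_accuracy:.2f}, test accuracy {test_accuracy:.2f}")
```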


Steps in machine learning

1. Collect data
2. Prepare input data (clean)
3. Analyze the data
4. (If supervised) train algorithm
5. Test algorithm
6. Use algorithm


Data
• In all healthcare organizations, clinical data takes a variety of forms, ranging from
◦ Structured (e.g., images, lab results, etc.) to
◦ Unstructured (e.g., textual notes including clinical narratives, reports, and other types of documents).


Types of data in EHR


Quantitative data (e.g. laboratory values)
Qualitative data (e.g. text based documents and demographics)
Transactional data (e.g. record of medication delivery)
• Different forms of data determine what can or can’t be done with the data, e.g. two patient names can’t be added together
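
The point that the form of the data determines the valid operations can be sketched as below. All values and names are made up for illustration. Averaging lab values and subtracting delivery times are meaningful, while trying to total patient names fails; counting distinct names is still fine.

```python
from datetime import datetime
from statistics import mean

lab_values = [5.4, 6.1, 7.0]          # quantitative: arithmetic is meaningful
patient_names = ["Abebe", "Sara"]     # qualitative: arithmetic is not
deliveries = [                        # transactional: time-stamped events
    datetime(2022, 7, 18, 8, 0),
    datetime(2022, 7, 18, 20, 0),
]

average_lab = mean(lab_values)
hours_between_doses = (deliveries[1] - deliveries[0]).total_seconds() / 3600

try:
    total = sum(patient_names)  # totalling names raises TypeError
except TypeError:
    total = None

name_count = len(set(patient_names))  # counting distinct names is still valid
print(average_lab, hours_between_doses, total, name_count)
```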


Data inconsistencies
Inconsistent naming conventions, such as “systolic blood pressure” vs “blood pressure, systolic”
Inconsistent definitions, such as how the date of admission is defined across departments
Varying field lengths for the same data element, such as allowing 50 characters for patient last name in one system and 25 in another
Varied data element codes, such as M, F, or U for patient gender in one system and 1, 2, 9 in another
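
A sketch of reconciling these inconsistencies before analysis is given below. The alias table, the assumption that 1, 2, 9 correspond to M, F, U in that order, and the `harmonize` function are all hypothetical; in practice each mapping must be confirmed against the source system's own documentation.

```python
# Hypothetical mappings; real ones would come from each system's data dictionary.
FIELD_ALIASES = {
    "systolic blood pressure": "systolic_bp",
    "blood pressure, systolic": "systolic_bp",
}
GENDER_CODES = {"M": "M", "F": "F", "U": "U", "1": "M", "2": "F", "9": "U"}  # assumed
MAX_LAST_NAME = 25  # the shorter of the two field lengths (50 vs 25)

def harmonize(record):
    """Map one source record onto shared field names and codes."""
    out, warnings = {}, []
    for field, value in record.items():
        key = FIELD_ALIASES.get(field.strip().lower(), field)
        out[key] = value
    if "gender" in out:
        out["gender"] = GENDER_CODES.get(str(out["gender"]), "U")
    if len(out.get("last_name", "")) > MAX_LAST_NAME:
        warnings.append("last_name exceeds the shorter field length")
    return out, warnings

# Two made-up records describing the same kind of data in different conventions
system_a = {"Systolic blood pressure": 120, "gender": "F", "last_name": "Tesfaye"}
system_b = {"Blood pressure, systolic": 135, "gender": 1, "last_name": "Bekele"}

a, warnings_a = harmonize(system_a)
b, warnings_b = harmonize(system_b)
print(a)
print(b)
```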


Data dictionaries
• First step: obtain the data dictionary to understand your data



A standard definition of data elements, which creates transparency
Enables analysts to report consistently and accurately
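
One way to picture a data dictionary is as a lookup of element definitions that records can be checked against. The entries in `DATA_DICTIONARY` and the `validate` function are hypothetical, written only to illustrate the idea; real dictionaries carry far more detail (units, sources, update rules).

```python
# Hypothetical data dictionary: element name -> definition, type, and allowed values.
DATA_DICTIONARY = {
    "patient_id": {"type": str, "definition": "Unique patient identifier"},
    "gender": {"type": str, "allowed": {"M", "F", "U"}, "definition": "Administrative gender"},
    "systolic_bp": {"type": int, "definition": "Systolic blood pressure in mmHg"},
}

def validate(record):
    """Return a list of ways the record departs from the data dictionary."""
    problems = []
    for name, spec in DATA_DICTIONARY.items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, spec["type"]):
            problems.append(f"{name}: expected {spec['type'].__name__}")
        elif "allowed" in spec and value not in spec["allowed"]:
            problems.append(f"{name}: value {value!r} not allowed")
    for name in record:
        if name not in DATA_DICTIONARY:
            problems.append(f"{name}: not defined in the dictionary")
    return problems

good = {"patient_id": "101", "gender": "F", "systolic_bp": 120}
bad = {"patient_id": "102", "gender": "female", "sbp": 118}
print(validate(good))
print(validate(bad))
```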


Common terms used in statistical analytics


Population
Sample
Confidence interval
Data set
Correlation vs causation


Population
A group of things that have something in common
Examples: patients in a particular hospital, patients with a certain diagnosis, patients who have had a certain surgical procedure


Sample
A representative portion or subset of a group of things
Part of a population


Confidence intervals
• Indicate how well a sample approximates the entire population
• The confidence level is often set at 95%
• Data set: a collection of data for a specific purpose, e.g. a collection of 500 records consisting of age, gender, state of residence…
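
As a worked illustration of a 95% confidence interval, the sketch below estimates a population mean from a sample using the normal critical value 1.96. The `ages` sample is made up, and the function name is an assumption. For small samples a t-distribution critical value would be more appropriate than 1.96.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval_95(sample):
    """Approximate 95% CI for the mean, using the normal critical value 1.96."""
    n = len(sample)
    centre = mean(sample)
    margin = 1.96 * stdev(sample) / sqrt(n)  # 1.96 standard errors either side
    return centre - margin, centre + margin

# Hypothetical sample of ages drawn from a larger patient population (made-up values)
ages = [34, 45, 52, 61, 29, 70, 48, 55, 41, 63, 38, 57, 49, 66, 44,
        51, 36, 59, 47, 53, 42, 60, 39, 56, 50, 64, 33, 58, 46, 54]

low, high = confidence_interval_95(ages)
print(f"sample mean {mean(ages):.1f}, 95% CI ({low:.1f}, {high:.1f})")
```

A narrower interval means the sample pins down the population mean more tightly; a larger sample shrinks the margin because it divides by the square root of n.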


Correlation and Causation


Correlation: a relationship between two things
Causation: one thing causes another
Correlation doesn’t equal causation


Patient Identifiers
• Benefits
◦ Easy linkage of records
◦ Facilitate health information exchange
◦ Reduce errors and costs arising from duplicate and overlaid records
• Duplicate and overlaid records
◦ Duplicate records: when a single individual has more than one identifier
◦ Overlaid records: when more than one individual shares the same identifier
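
The two definitions above can be checked mechanically, as in this sketch. The registry rows are made up, and matching people on exact name plus date of birth is a deliberate simplification; real record linkage has to cope with spelling variants and missing fields.

```python
from collections import defaultdict

# Hypothetical registry rows: (identifier, name, date of birth). All values are made up.
registry = [
    ("MRN001", "Almaz Kebede", "1985-03-02"),
    ("MRN002", "Almaz Kebede", "1985-03-02"),  # same person, second identifier
    ("MRN003", "Dawit Haile", "1990-11-15"),
    ("MRN003", "Hana Girma", "1978-06-21"),    # different person, same identifier
]

ids_by_person = defaultdict(set)
people_by_id = defaultdict(set)
for identifier, name, dob in registry:
    person = (name.lower(), dob)  # naive matching key; real matching is fuzzier
    ids_by_person[person].add(identifier)
    people_by_id[identifier].add(person)

# Duplicate records: one individual with more than one identifier
duplicates = {p: sorted(ids) for p, ids in ids_by_person.items() if len(ids) > 1}
# Overlaid records: one identifier shared by more than one individual
overlaid = {i: sorted(ps) for i, ps in people_by_id.items() if len(ps) > 1}

print("duplicates:", duplicates)
print("overlaid:", overlaid)
```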


Key attributes of patient identifiers


Unique: only one person has a particular identifier
Non-disclosing: discloses no personal information
Permanent: will never be reused
Canonical: each person has only one
Invariable: will not change over time


Challenges to Data Analytics


Data may be inaccurate or incomplete, or may have been transformed in ways that undermine its meaning
Data may also incompletely adhere to well-known standards, which makes combining it from different sources more difficult
Clinical data mostly allows only observational, not experimental, studies, which makes it hard to establish cause and effect


Conclusion
Clearly there is great promise ahead for healthcare, driven by data analytics.
The growing quantity of clinical and research data, along with methods to analyze and put it to use, can lead to improved personal health, healthcare delivery, and biomedical research.
However, there is also a continued need to improve the completeness and quality of data, as well as to conduct research demonstrating how best to apply it to solve real-world problems.
In addition, human expertise, including in informatics, will be required to optimally carry out such work.


Qui…


END

Thank you!
