AI and ECM

Utrecht, October 18th 2019


Reinoud Kaasschieter
Table of Contents

• What is Enterprise Content Management?
• What is Artificial Intelligence?
• ECM and AI
• Knowledge Management
• Natural Language Processing
• Cognitive Computing

© Capgemini 2019. All rights reserved | 2


What is Enterprise Content Management?
What is ECM?
Gartner
Enterprise content management
(ECM) is used to create, store,
distribute, discover, archive and
manage unstructured content
(such as scanned documents,
email, reports, medical images
and office documents), and
ultimately analyze usage to enable
organizations to deliver relevant
content to users where and when
they need it.
https://www.gartner.com/it-glossary/enterprise-content-management-ecm/

Utrecht | June 18th 2019 | Reinoud Kaasschieter © Capgemini 2018. All rights reserved | 4
What is unstructured data?

Unstructured data are data that have no fixed data model, and
are not arranged in a fixed pre-defined manner
• Without preprocessing, unstructured data cannot be stored in
a table
• Examples: social media (tweets, blogs, posts, etc.), call
center data, email, surveys with open questions, etc.

Report RUGCIC-2016-01 - ISBN 978-90-367-9021-5

What is unstructured data?

Unstructured data are strongly linked to the three V’s of ‘Big Data’:
• Volume: unstructured data typically require more storage
space than structured data
• Velocity: the amount of unstructured data is increasing more
rapidly than the amount of structured data
• Variety: unstructured data are generated in previously
untapped data sources, which may reveal very personal
customer information

Report RUGCIC-2016-01 - ISBN 978-90-367-9021-5

Hierarchy of data in Big Data
Structured
• Data having a defined data model, format, structure
• E.g. Database

Semi-structured
• Textual data files with an apparent pattern, enabling analysis
• E.g. Spreadsheets and XML files

Quasi-structured
• Textual data with erratic formats that can be formatted with effort and software tools
• E.g. Clickstream data

Unstructured
• Data that has no inherent structure and is usually stored as different types of files
• E.g. text documents, PDFs, images, and videos

Source: EMC
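The four levels above can be illustrated with a rough heuristic. The sketch below is only an assumption about how one might guess the level of a text payload: the function name `guess_structure_level`, the inclusion of JSON next to XML, and the URL-per-line test for clickstream data are all hypothetical choices, not part of the EMC definition. It never returns "structured", because that level refers to data already held in a database.

```python
import csv
import io
import json
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: a clickstream line is assumed to contain a URL.
URL = re.compile(r"https?://\S+")


def guess_structure_level(payload: str) -> str:
    """Return a rough structure label for a text payload (heuristic only)."""
    text = payload.strip()
    if not text:
        return "unstructured"
    # XML that parses cleanly has an apparent pattern: semi-structured.
    try:
        ET.fromstring(text)
        return "semi-structured"
    except ET.ParseError:
        pass
    # JSON objects and arrays are treated the same way (an assumption).
    try:
        if isinstance(json.loads(text), (dict, list)):
            return "semi-structured"
    except json.JSONDecodeError:
        pass
    # Delimited rows with a constant column count resemble a spreadsheet.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if len(rows) > 1 and len(rows[0]) > 1 and len({len(r) for r in rows}) == 1:
        return "semi-structured"
    # Several lines that each carry a URL look like clickstream data.
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) > 1 and all(URL.search(line) for line in lines):
        return "quasi-structured"
    return "unstructured"


print(guess_structure_level("<doc><title>Report</title></doc>"))
print(guess_structure_level("Dear team, please find the report attached."))
```

A heuristic like this misclassifies edge cases (for example, prose in which every line happens to contain the same number of commas), which is one reason classification of unstructured content is a candidate for trained models rather than fixed rules.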

My personal view

Enterprise Content Management
• Sees a document or file as a whole.
• Interested in describing the document as metadata.
• Up-front classification for retrievability and maintenance.
• Aim is to make content of the document accessible.
• Keeping data relevant and unchanged over time.

Big Data and Data Analytics
• Sees a document as a container of data.
• Interested in describing the data elements by structuring them.
• Classification - when needed - for consumption.
• Aim is to analyse content of the document for insights.
• The decreasing relevancy of the data over time (and its changing nature).
Why Big Data will disrupt Document Management

Five reasons:
• Unstructured data is nothing special, just another data type.
• Storage and processing are no longer limiting.
• Operational value on actionable insights, not just processes.
• Going beyond search with Business Intelligence.
• AI and ML developments are tailored for Big Data environments.

Five things to be in place:
• Information Life Cycle
• Security & Privacy
• Pre-processing
• Preservation
• Versioning

https://www.capgemini.com/2015/07/5-things-big-data-can-learn-from-ecm-part-1-of-4/
What is Artificial Intelligence?
What is AI?

What is AI?

Back in the 1950s, the fathers of the field, Minsky
and McCarthy, described artificial intelligence as
any task performed by a program or a machine
that, if a human carried out the same activity, we
would say the human had to apply intelligence
to accomplish the task.

https://www.zdnet.com/article/what-is-ai-everything-you-need-to-know-about-artificial-intelligence/

What is intelligence?

The ability to acquire and apply knowledge and skills.

What does AI do?

• AI does prediction

• AI does pattern recognition

• AI does monitoring

• AI does knowledge management


Artificial Intelligence and Machine Learning

• Artificial intelligence (AI) is an umbrella term for a branch of
advanced computer science that attempts to build machines capable of
intelligent behaviour. It replicates human attempts to carry out tasks
and solve problems – but is much, much faster.

• Machine learning is a sub-branch of AI. It allows computers to learn
from large amounts of data without the need to explicitly program
them. Machine learning systems also learn from past behaviour to
predict future behaviour.
https://www.bandt.com.au/opinion/difference-ai-machine-learning-means-future-work
Louise Matsakis:

«Engineers may ultimately need to make a choice
between building automated systems that are the
most accurate, versus ones that are the most
similar to humans.»

https://www.wired.com/story/adversarial-examples-ai-may-not-hallucinate/

It's all about the fundamentals

• Get your data in order

• Fix the infrastructure

• Build trust in the algorithms

https://gcn.com/articles/2019/05/08/ai-roundtable.aspx

ECM and AI
How can we use AI within ECM?

AI can help your organization to derive value from
unstructured data.

Based on algorithms, AI technologies are capable of analyzing
data, finding solutions to the problems analyzed, and making
automated decisions about concrete outcomes. Technologies like
these can easily make content more useful, workflows more
efficient and interaction more productive.

https://www.armedia.com/blog/ai-ecm-unstructured-data/

How can we use AI within ECM?

• Document and image searches
• Data categorization and indexing
• Intelligent redaction (e.g. obfuscation) and content creation
• Knowledge extraction and ranking
• Unlocking of siloed, historical data
• Robotic Process Automation (RPA)
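To make one item from the list above concrete, here is a minimal sketch of redaction. It is deliberately rule-based: it uses two regular expressions as a stand-in, whereas an AI-based redaction tool would typically rely on a trained entity recognizer. The patterns, the placeholder labels and the function name `redact` are illustrative assumptions only.

```python
import re

# Simplified e-mail pattern; real addresses can be more varied.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Hypothetical pattern for phone-like runs of digits, spaces and dashes.
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def redact(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


print(redact("Contact jan@example.com or +31 30 123 4567."))
# prints: Contact [EMAIL] or [PHONE].
```

Fixed patterns like these miss personal data that does not follow a predictable format (names, addresses), which is where learned models are expected to add value.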

For example: input management

No AI yet
• Digitization
• Conversion
• Image Enhancement
• Barcode Recognition
• Optical Mark Recognition

Enhanced with AI
• Optical Character Recognition
• Intelligent Character Recognition
• Handwritten Character Recognition
• Forms Recognition
• Metadata Extraction
• Speech to Text

AI only
• Content Classification
• Concept Extraction
• Sentiment Analysis
• Labelling (Annotation)
• Intelligent Search


How is AI used within Enterprise Content Management?
In my humble opinion:

• AI is mainly used to enhance existing tasks.
• AI will be packaged in existing packages for input, storage, output etc.
• Leverage for AI will come from Big Data, not from ECM itself.

AI within ECM will only kick off when…
• we don’t look at documents or files, but at unstructured data,
regardless of form, source or media;
• information management and information governance are present;
• we have a proper understanding of how, where, when and by whom
information is consumed.

How far do you want to go?

recognition | classification | (feature) extraction | validation | indexing | interpretation
form and text recognition | data extraction | document classification | validation | (meta)data export
Context (of use) | Content (to be used)

Knowledge systems
Words

“Words (…) are full of echoes, of memories, of
associations – naturally. They have been out
and about, on people’s lips, in their houses, in
the streets, in the fields, for so many
centuries. And that is one of the chief
difficulties in writing them today – that they
are so stored with other meanings, with other
memories, that they have contracted so many
famous marriages in the past.”
Virginia Woolf

Language and reality

Does the Language We Speak Affect Our Perception of Reality?

“(…) studies have found effects of language on how people construe
events, reason about causality, keep track of number, understand
material substance, perceive and experience emotion, reason about
other people’s minds, choose to take risks, and even in the way they
choose professions and spouses.”

Lera Boroditsky

[Diagram labels: Linguistic objects, Brain, Black-box, Linguistic representations, Semantics, Actions, Non-linguistic representations, Non-linguistic reality]
«The Phenomenon of Science - a cybernetic approach to human evolution» by Valentin F. Turchin

Knowledge

The DIKW Pyramid

My Cognitive
Computing

https://www.climate-eval.org/blog/answer-42-data-information-and-knowledge
Natural Language Processing

Natural Language Processing and Artificial Intelligence

Natural Language Processing (NLP):
NLP is the ability to train computers to understand both written text
and human speech. (…) Unlike structured database information that
relies on schemas to add context and meaning to the data, unstructured
information must be parsed and tagged to find the meaning of the text.

Judith Hurwitz and Daniel Kirsch – Machine Learning for Dummies - IBM Edition
Natural Language Processing (NLP)

Basic NLP tasks include:
• Tokenization and parsing
• Lemmatization/stemming
• Part-of-speech tagging
• Language detection
• Identification of semantic relationships

https://nlp.stanford.edu/~wcmac/papers/20140716-UNLU.pdf
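Three of the basic tasks listed above (tokenization, stemming and language detection) can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not how production NLP libraries work: the suffix list, the tiny stop-word sets and all function names are hypothetical, and real systems use trained models or full linguistic rule sets.

```python
import re

# Tiny, hypothetical stop-word lists; real detectors use far more evidence.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in"},
    "nl": {"de", "het", "en", "is", "van", "een"},
}


def tokenize(text: str) -> list:
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def stem(token: str) -> str:
    """Strip one common English suffix, keeping at least three characters."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def detect_language(tokens: list) -> str:
    """Pick the language whose stop words occur most often (ties go to 'en')."""
    scores = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    return max(scores, key=scores.get)


tokens = tokenize("The scanned documents are stored in the archive.")
print(tokens)
print([stem(token) for token in tokens])
print(detect_language(tokens))
```

The naive stemmer produces non-words such as "stor" for "stored". That is typical of stemming, and it is the reason lemmatization, which maps a word to its dictionary form, is listed as the alternative.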

Utrecht | June 18th 2019 | Reinoud Kaasschieter © Capgemini 2018. All rights reserved | 31
IBM Watson – How does it work?

Starting with ECM and Natural Language Processing

Healthy data
• High quality
• Accurate
• Current
• Non-ROT (not redundant, obsolete or trivial)
• Maintained

Taxonomy
• Pre-defined, or
• Crafted (manual or augmented*)
• Matching

Organisation
• Knowledge
• People
• Trust
• Records Management*


Cognitive computing

Cognitive Computing Complements Traditional Analytics
By creating a value continuum for the industry

Analytics
• Addresses predefined problems
• Provides accurate and definite answers
• Handles information with known semantics
• Interacts with humans through formal digital means (e.g. commands, screens)

Cognitive Computing
• Addresses ambiguous problems
• Provides answers with a margin of error
• Handles information without explicitly knowing semantics
• Interacts with humans in a natural language

Source: IBM

Defining Cognitive Computing
Cognitive computing: solving problems with humanlike thinking

• First, artificial intelligence does not try to mimic human thought
processes. Instead, a good AI system is simply the best
possible algorithm for solving a given problem.
• Second, cognitive computing does not make decisions for
humans, but rather supplements our own decision-making.
• My blog post: «Augmented artificial intelligence: Will it work?»
https://tinyurl.com/augmentedai

https://www.rtinsights.com/whats-the-difference-between-cognitive-computing-and-ai/

Defining Cognitive Computing
Function, scope and limitations

1. Engagement: the systems are able to engage in deep dialogue
with humans.
2. Decision: decisions made by cognitive systems continually evolve
based on new information, outcomes, and actions.
3. Discovery: discovery involves finding insights and understanding
vast amounts of information and developing skills.

https://www.marutitech.com/cognitive-computing-features-scope-limitations/

Defining Cognitive Computing
Discovery
Discovering deep knowledge by delving through large amounts of data
and detecting unseen relations between information elements, beyond
what is humanly possible.

Understanding and applying knowledge
• Facts, information, and skills acquired through experience or
education; the theoretical or practical understanding of a subject.
• Awareness or familiarity gained by experience of a fact or situation.

Applying reasoning and ethics

Learning through expanding and feedback
• Learning depends on the ability to trace why the particular decision
was made and change the confidence score of a system's response.
• Rhetorical question: how many software systems have proper feedback
loops implemented?
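The feedback idea above (trace why a decision was made, then adjust the confidence score) can be sketched as follows. This is a hypothetical illustration, not a description of any particular product: the `Answer` class, the `apply_feedback` function and the learning rate of 0.2 are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    text: str
    confidence: float  # between 0.0 and 1.0
    evidence: list = field(default_factory=list)  # trace of why it was chosen


def apply_feedback(answer: Answer, correct: bool, rate: float = 0.2) -> Answer:
    """Move the confidence a fraction of the way towards 1.0 or 0.0."""
    target = 1.0 if correct else 0.0
    answer.confidence += rate * (target - answer.confidence)
    # Record the feedback so the decision remains traceable afterwards.
    answer.evidence.append("feedback: " + ("confirmed" if correct else "rejected"))
    return answer


answer = Answer("Invoice", 0.5, ["keyword 'invoice' found in title"])
apply_feedback(answer, correct=True)
print(round(answer.confidence, 2), answer.evidence)
```

Keeping the evidence list next to the score is the design choice that matters here: without a record of why an answer was given, a user has nothing to judge when deciding whether to confirm or reject it.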

Defining Cognitive Computing
Limitations of cognitive computing

1. Limited analysis of risk
→ Real-world models
2. Meticulous training process
→ Curation of data, information and content
3. More intelligence augmentation rather than artificial intelligence
→ Errors and ethics

Learn more → http://www.discovery.com/ThisIsAI

Defining cognitive computing
Examples of products

Google DeepMind
The results (...) show that our AI system can quickly interpret eye
scans from routine clinical practice with unprecedented accuracy.
https://deepmind.com/blog/moorfields-major-milestone/

Microsoft Cognitive Services
Because the Cognitive Services APIs harness the power of machine
learning, we were able to bring advanced intelligence into our product
without the need to have a team of data scientists on hand.
https://azure.microsoft.com/nl-nl/services/cognitive-services/

IBM Watson
Crédit Mutuel found that a significant part of their work involved
answering simple and repetitive questions. With this in mind, the bank
turned to IBM to find a solution that could speed up everyday
processes and allow client advisors time to address more complicated
and nuanced problems.
https://www.ibm.com/watson/stories/creditmutuel/

Defining cognitive computing
Examples of applications

Deutsche Bundeswehr
The IBM program predicts “potential crises” before they occur over the
next six to 18 months.
http://www.bundeswehr-journal.de/2018/blick-in-die-zukunft-big-data-software-fuer-die-bundeswehr/

Paralegal ROSS
Ross improves upon existing alternatives by actually understanding
your questions in natural language like - Can a bankrupt company still
conduct business? Ross then provides you with an instant answer with
citations and suggests highly topical readings from a variety of
content sources.
https://medium.com/@innovationKEY/hi-i-m-ross-your-new-paralegal-may-i-be-your-watson-21b9a14ad263

Watson for Oncology
Watson for Oncology combines leading oncologists’ deep expertise in
cancer care with the speed of IBM Watson to help clinicians as they
consider individualized cancer treatments for their patients.
https://www.ibm.com/us-en/marketplace/ibm-watson-for-oncology

Cognitive Computing and Artificial Intelligence

The difference between AI and Machine Learning (…) is like the
difference between economics and accounting.

Parmy Olson, Forbes

Many conventional AI systems are merely machine learning, or neural
networks, or deep learning. They’re good at handling large sets of
data but lack situational awareness or the ability to navigate around
missing or incomplete data. They get stuck.

AJ Abdallat, CEO of Beyond Limits

http://www.automatedtrader.net/articles/artificial-intelligence/153528/best-of-the-blogs-_-ai--machine-learning--data-mining--and-big--data
Pitfalls and critical points
Real world experiences from Watson for Oncology

1. Wrong diagnosis and treatments
Errare humanum est: oh so human errors
Cultural bias: Sloan-Kettering (US) oriented methods
2. Feedback loops and learning
Learning: incomplete and unreadable reports as feedback
Curating content: keeping up-to-date
3. Transparency
Being judgmental: fool or genius?
The Why-question: why is this data selected? Why is that data not selected?
4. Added value and cost effectiveness
How to excel: only with miscellaneous or exceptional cases
Value creation: when humans make the right decisions, what's the added value?

Some quality issues

Data quality
• Completeness
• Consistency
• Conformity
• Accuracy
• Integrity
• Timeliness

Unstructured data
• Readability
• Structurability (e.g. PDF)
• Interpretability
• Classification

Truthfulness
• (Unconscious) Bias
• Lifecycle
• Context

Content Curation
It’s all about the process

Learning, improving & expanding
It’s still all about the process

Further reading → Machine Intelligence quality characteristics
https://www.sogeti.com/explore/reports/machine-intelligence-quality-characteristics/
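Two of the data quality characteristics on this slide, completeness and timeliness, lend themselves to simple automated checks on document metadata. The sketch below is an assumption about what such checks might look like: the required field names, the one-year age limit and both function names are hypothetical and would differ per organization.

```python
from datetime import date

# Hypothetical set of metadata fields every document record should carry.
REQUIRED_FIELDS = ("title", "author", "created")


def completeness(record: dict) -> float:
    """Return the fraction of required fields that hold a non-empty value."""
    filled = sum(1 for name in REQUIRED_FIELDS if record.get(name))
    return filled / len(REQUIRED_FIELDS)


def is_timely(record: dict, today: date, max_age_days: int = 365) -> bool:
    """Return True when the record has a creation date within the age limit."""
    created = record.get("created")
    return created is not None and (today - created).days <= max_age_days


record = {"title": "Annual report", "author": "", "created": date(2019, 1, 10)}
print(completeness(record))
print(is_timely(record, today=date(2019, 6, 18)))
```

Characteristics such as interpretability, context and bias are much harder to capture in a rule like this, which is why the slide treats them separately from classic data quality.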
