REPORT

Knowledge Graphs: Data in Context for Responsive Businesses
Table of Contents

Foreword

1. Introduction
    What Are Graphs?
    The Motivation for Knowledge Graphs
    Knowledge Graphs: A Definition

    Graph Algorithms
    Graph Embeddings
    ML Workflows for Graphs
    Graph Visualization
    Decisioning Knowledge Graph Use Cases
    Boston Scientific's Decisioning Graph
    Better Predictions and More Breakthroughs

5. Contextual AI
    Why AI Needs Context
    Data Provenance and Tracking for AI Systems
    Diversifying ML Data
    Better ML Processes
    Improving AI Reasoning
    The Big Picture
Foreword
The winners in the data industry know where the puck is going: making data smarter. This can be accomplished by integrating data with knowledge at scale, and this is where knowledge graphs come in. This book is a practical guide to understanding what knowledge graphs are and why you should care. Importantly, it strikes the right balance between technical aspects and the corresponding business value for an organization. If you need to make the business case for knowledge graphs, this is the book for you!
Let’s first talk about the elephant in the room: RDF versus property graphs. Over the years, I’ve enjoyed my conversations with Jesús Barrasa on this topic. We have always been strong believers that these technologies will converge because, at the end of the day, it’s all just a graph! This book is evidence of this convergence: enriching the property graph model with taxonomies, ontologies, and semantics in order to create knowledge graphs. And don’t forget that the conversation should focus on the business value and not just the technology.
How do you get started on your knowledge graph journey?
First, one of my mantras is don’t boil the ocean. This means that your
knowledge graph journey should start simple, be practical, and focus
on the business return coming from the right amount of semantics.
Second, I always say that you need to crawl, walk, and run. Crawling means starting by creating a knowledge graph of your metadata that catalogs the data within your organization. I’m thrilled to see that we are fully aligned: effective data integration solutions rely on understanding the relationships between data assets, which is at the heart of knowledge graphs. Furthermore, in the AI and ML era that we live in, understanding the quality and governance of your data is key for effective decision-making.
Speaking of AI, knowledge graphs are changing AI by providing context. This leads to explainability, diversification, and improved processing. If AI is changing the future and knowledge graphs are changing AI, then by transitivity, knowledge graphs are also changing the future.
If you are still asking yourself “why knowledge graphs?”, guess what...your competitors aren’t! Don’t be that person! Jesús, Amy, and Jim have written this book just for you.
CHAPTER 1
Introduction
In this report, we show how we can build systems with smarter data using knowledge graph techniques.
Graph theory has its origins in 1736, when Leonhard Euler tackled the puzzle of how the emperor had to walk to see the town of Königsberg (modern-day Kaliningrad) by ensuring that each of its seven bridges was crossed only once, as shown in Figure 1-2.
Euler’s insight was that the problem shown in Figure 1-2 could be
reduced to a logical form, stripping out all the noise of the real
world and concentrating solely on how things are connected. He
was able to demonstrate that the problem didn’t need to involve
bridges, islands, or emperors. He proved that in fact the physical
geography of Königsberg was completely irrelevant.
Using the superimposed graph in Figure 1-2, you can try to figure out a route around Königsberg that crosses each bridge only once, without having to put on your walking boots and try it for real. In fact, Euler proved that the emperor could not walk the whole town crossing each bridge only once. Such a walk is possible only when at most two land masses (nodes) have an odd number of connecting bridges (relationships), since those odd nodes can serve only as the start or end of the walk. In Königsberg, all four land masses have an odd number of connecting bridges, so no such walk exists.
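Euler’s parity argument is simple enough to check mechanically. The following minimal Python sketch, with illustrative land-mass names (A stands in for the central island), counts bridge ends per land mass and tests the odd-degree condition:

    # Euler's parity check for the Seven Bridges of Königsberg.
    # Land-mass names are illustrative; A is the central island.
    from collections import Counter

    bridges = [
        ("A", "B"), ("A", "B"),  # two bridges between A and B
        ("A", "C"), ("A", "C"),  # two bridges between A and C
        ("A", "D"), ("B", "D"), ("C", "D"),
    ]

    # A node's degree is the number of bridge ends that touch it.
    degree = Counter()
    for u, v in bridges:
        degree[u] += 1
        degree[v] += 1

    # A walk crossing every bridge exactly once exists only if zero or
    # two nodes have odd degree. Here all four are odd.
    odd = [n for n, d in degree.items() if d % 2 == 1]
    print(odd)                 # ['A', 'B', 'C', 'D']
    print(len(odd) in (0, 2))  # False: no such walk exists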
Each of the graph models has its own quirks and benefits. In contemporary IT systems, enterprises have mostly settled on the property graph model. It’s a model that is well suited to common data management problems and straightforward for software and data professionals to work with. To illustrate the property graph model, we’ve created a tiny social graph in Figure 1-3; compared to the example in Figure 1-2, this graph is far richer.
In Figure 1-3 each node has a label that represents its role in the graph. Some nodes are labeled Person and some are labeled Place, representing people and places respectively. Stored inside those nodes are properties. For example, one node has name:'Rosa' and gender:'f', which we can interpret as a female person called Rosa. Note that the Karl and Fred nodes have slightly different properties on them, which is perfectly fine too. If we need to ensure that all Person nodes have the same property keys, we can apply constraints to the label to ensure those properties exist, are unique, and so on.
Between the nodes in Figure 1-3 we have relationships. The relationships are richer than in Figure 1-2, since they have a type and a direction and can have optional properties on them. The Person node with the name:'Rosa' property has an outgoing LIVES_IN relationship, with the property since: 2020, to the Place node with the city:'Berlin' property. We read this in slightly poor English as “Rosa lives in Berlin since 2020” and definitely not that Berlin lives in Rosa! We also see that Fred is a FRIEND of Karl and that Karl is a FRIEND of Fred. Rosa and Karl are also friends, but Rosa and Fred are not.
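To make the model concrete, here is a minimal sketch of the Figure 1-3 graph in plain Python. This is only an illustration of the structure (a graph database stores and indexes this natively), and the node IDs are invented:

    # The Figure 1-3 social graph as plain Python data structures.
    nodes = {
        1: {"labels": ["Person"], "props": {"name": "Rosa", "gender": "f"}},
        2: {"labels": ["Person"], "props": {"name": "Karl"}},
        3: {"labels": ["Person"], "props": {"name": "Fred"}},
        4: {"labels": ["Place"], "props": {"city": "Berlin"}},
    }

    # Relationships have a type, a direction (source -> target), and
    # optional properties of their own.
    rels = [
        {"type": "LIVES_IN", "source": 1, "target": 4, "props": {"since": 2020}},
        {"type": "FRIEND", "source": 1, "target": 2},  # Rosa -> Karl
        {"type": "FRIEND", "source": 2, "target": 1},  # Karl -> Rosa
        {"type": "FRIEND", "source": 2, "target": 3},  # Karl -> Fred
        {"type": "FRIEND", "source": 3, "target": 2},  # Fred -> Karl
    ]

    def friends_of(name):
        """Follow outgoing FRIEND relationships from the named person."""
        ids = {i for i, n in nodes.items() if n["props"].get("name") == name}
        return [nodes[r["target"]]["props"]["name"]
                for r in rels if r["type"] == "FRIEND" and r["source"] in ids]

    print(friends_of("Rosa"))  # ['Karl'] -- Rosa and Fred are not friends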
Graph Databases
Finding connections between data points is a natural and powerful way of making information discoveries. Graphs and graph theory are amazing tools in their own right for modeling and analyzing data. Graph Databases, 2nd Edition (O’Reilly, 2015) by Ian Robinson, Jim Webber, and Emil Eifrem can help you understand how to use a graph database to power your systems.
It’s easy to see how the graph in Figure 1-3 can answer questions about friendships and who lives where. Extending the model to include other important data items like interests, publications, or jobs is also straightforward: just keep adding nodes and relationships to match your problem domain. Creating large, complex graphs with many millions or billions of connections is not a problem for modern graph databases and graph-processing software, so building even very large knowledge graphs is possible.
Graph data models are uniquely able to represent complex, indirect relationships in a way that is both human readable and machine friendly. Data structures like graphs might seem computerish and off-putting, but in reality they are created from very simple primitives and patterns. The combination of a humane data model and ease of algorithmic processing to discover otherwise hidden patterns and characteristics is what has made graphs so popular. It’s a combination we will exploit in our knowledge graphs.
1 Human relationships such as love are often symmetric, but we express that symmetry
with two relationships.
Now that we’re comfortable with graphs, we move forward to interpreting connected data as knowledge.
CHAPTER 2
Building Knowledge Graphs
There are several different approaches to organizing data in a graph, each with its own benefits and quirks. We’re free to pick and choose the ones that are best suited to our problem, and we’re free to compose them together too. Starting with a basic (but useful) graph, we’ll show how to add successive layers of organization, demonstrating how knowledge graphs can be used to solve increasingly sophisticated problems.
In Figure 2-2 we’ve used the organizing principle from the property
graph model to showcase various product features. A competent
user could extract all wireless headphones from the data as easily as
they could extract all audio products. With only slightly more effort,
aggregate sales data per product or per product type could also be
computed. But the way the data is set up does not provide much
scope for reasoning: labels don’t provide enough information to
know that, for example, one product is substitutable for another. For
that we need a stronger organizing principle.
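As a sketch of what such a stronger organizing principle looks like, the following Python fragment layers a small taxonomy (SUBCLASS_OF edges) over invented product data; the product and category names are illustrative, not taken from Figure 2-2. Labels alone can’t tell us that wireless headphones are audio products, but walking the hierarchy can:

    # An illustrative taxonomy layered over product data; all names invented.
    subclass_of = {
        "WirelessHeadphones": "Headphones",
        "Headphones": "AudioProducts",
        "Speakers": "AudioProducts",
    }
    product_category = {"AcmeBuds": "WirelessHeadphones", "BoomBox": "Speakers"}

    def ancestors(category):
        """Yield the category and every parent up the SUBCLASS_OF hierarchy."""
        while category is not None:
            yield category
            category = subclass_of.get(category)

    print("AudioProducts" in ancestors(product_category["AcmeBuds"]))  # True

The same walk up the hierarchy lets us treat products whose categories share a nearby ancestor as candidate substitutes, which is exactly the kind of reasoning plain labels cannot support.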
2 Strictly speaking, taxonomies are also ontologies, but they use only the basic constructs of categories and hierarchical relationships.

3 Not all the data will necessarily be in a single database. Generally, we require some systems integration work so that the data can be accessed over the network from the myriad of databases where it is stored.

4 Interestingly enough, the vast majority of the public RDF data on the web is not hosted in triple stores but embedded in web pages in the form of JSON-LD snippets, further proving this point.

1 According to the ICIJ, the Panama Papers was a “giant leak of more than 11.5 million financial and legal records [that] exposes a system that enables crime, corruption and wrongdoing, hidden by secretive offshore companies.”
But that was not the end of the story. The work of the ICIJ is ongoing, and so the data team at ICIJ kept improving the data pipeline for populating the knowledge graph. They delivered a freshly updated graph to the journalists every day, despite ongoing changes like new code releases, additional data sources, and new team members. In many respects, the ICIJ’s journey is not so different from that of most other data owners, chief data officers (CDOs), or chief information officers (CIOs). A CDO or CIO has to ensure that data with the right quality (accuracy, completeness, timeliness) is accessible to the right people and processes and is used in a way that complies with rules and regulations.
This chapter explains the foundations for actioning knowledge graphs, which are used to drive decisions or actions based on data. We discuss knowledge graphs that support holistic understanding and linking of multiple data sources across the enterprise, and how we can incorporate provenance metadata into the mixture. Throughout, we provide several use cases and an end-to-end example to illustrate approaches a CDO or CIO could take.
Knowledge graphs like the one in Figure 3-1 can answer domain questions like “What customers subscribe to service X?” but can also provide confidence in the answer with provenance and governance metadata. Importantly, data architects can implement this technique in a noninvasive manner with respect to the source systems containing customer data, by building it as a layer above those systems. A popular example of this is to build a knowledge graph of metadata that describes data residing separately in an otherwise murky data lake.
A globally linked view of data unlocks many significant use cases. Figure 3-1 shows the powerful combination of the data and metadata planes in the graph, but it’s easy to see how we can construct a complete view of a customer by linking disparate partial views of customers across different systems. We could enrich the original customer view with the customer’s behavior over their lifetime, detect patterns for good and bad customers, and adapt processes to suit their behavior. Similarly, on the metadata plane, we’ve started by showing how to offer a catalog of richly described datasets connected to the data sources where they are hosted, but we can also semantically enrich them by linking them to vocabularies, taxonomies, and ontologies to enable interoperability. Or we can add ownership/stewardship information to enhance governance and map where data came from as well as which systems and individuals have handled it.
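To sketch the metadata plane, the fragment below records datasets, hosting systems, stewards, and derivations as simple edges (all names are invented) and traces a dataset’s provenance with a recursive traversal:

    # Illustrative metadata-plane edges: (source, relationship, target).
    edges = [
        ("customers_v2", "HOSTED_ON", "crm_db"),
        ("customers_v2", "STEWARDED_BY", "alice"),
        ("orders_2023", "HOSTED_ON", "warehouse"),
        ("orders_2023", "DERIVED_FROM", "customers_v2"),
    ]

    def targets(node, rel):
        return [t for s, r, t in edges if s == node and r == rel]

    def upstream(dataset):
        """Trace provenance: everything the dataset derives from, recursively."""
        for source in targets(dataset, "DERIVED_FROM"):
            yield source
            yield from upstream(source)

    print(list(upstream("orders_2023")))  # ['customers_v2']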
2 For example, the Neo4j graph database provides user-defined procedures that can call
out to other systems and combine the results from those systems with local data, which
is all processed as if it were a local graph query.
Metadata Management
Patterns of data usage have changed profoundly in the enterprise. In organizations with mature data management practices, expertise and control over data are shifting from a centralized setup to a distributed setup where business units provide domain expertise and systems architecture is often designed as a set of loosely coupled cooperating services.
As a consequence, your organization probably has many data users, and you have probably seen explosive growth in the number of internal data resources: data tables, dashboards, reports, metrics definitions, and so on. On the positive side, this shows your investment in data-informed decision making, but how do you make sure your users effectively navigate this sea of data assets of varying quality, freshness, and trustworthiness? The solution is a metadata hub, which has an actioning knowledge graph at its core.
Figure 3-3. A graph describing a simple data pipeline
Physically, our decisioning graph might or might not be the same graph as our actioning knowledge graph. Sometimes it’s helpful to keep all the actionable data and decision making together (particularly when we want to enrich the actioning knowledge graph), and sometimes we want to physically separate them (for data science workflows).
However we physically arrange our infrastructure, our toolkit for these jobs consists of discovery, analytics, and data science. Separately, each is helpful, but together they become extremely powerful tools for turning decisions into useful actions.
Despite the predictive power of relationships, most analytics and data science practices ignore them because it has been historically challenging to process them at scale. Consider trying to find similar customers or products in a three-hop radius of an account. With nongraph technology, you might be able to process this data, even if it is slower than querying a knowledge graph. But what if you need to scale such processing over a large graph of your customer base, then distill useful information (e.g., for every pair of accounts in this radius, calculate the number of accounts in common), and finally transform the results into a format required for machine processing? It’s just not practical in a nongraph system. This explosion of complexity quickly overwhelms the ability to perform processing and hinders the use of “graphy” data for predictions, to the ultimate detriment of decision makers.
Instead of ignoring relationships, knowledge graphs incorporate
them into analytics and ML workflows. Graph analytics excels at
finding the unobvious because it can process patterns even when we
don’t exactly know what to look for, while graph-based ML can
predict how a graph might evolve. This is precisely what most data
scientists are trying to achieve!
Queries
These are written by humans during an investigation and typically produce human-readable results.
Graph Queries
Most analysts start down the path of graph analytics with graph
queries, which are (usually) human crafted and human readable.
They’re typically used for real-time pattern matching when we know
the shape of the data we’re interested in. For example, in Figure 4-2
we’re looking for potential allies in a graph of enemies on the basis
of the concept, “the enemy of my enemy is my friend.” Once a
potential ally has been located, we create a FRIEND relationship.
Unlike data discovery, where we’re asking a specific question for
investigation, here we use the query results to feed subsequent
analyses.
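In a graph database this pattern would be a single declarative query; the plain-Python sketch below, with invented names, shows the same “enemy of my enemy” logic as an explicit traversal:

    # Find potential allies: the enemy of my enemy is my friend.
    enemies = [("ada", "bob"), ("bob", "cru"), ("cru", "dan")]

    # Treat enmity as symmetric for this sketch.
    enemy_of = {}
    for a, b in enemies:
        enemy_of.setdefault(a, set()).add(b)
        enemy_of.setdefault(b, set()).add(a)

    friends = set()
    for person, foes in enemy_of.items():
        for foe in foes:
            for candidate in enemy_of[foe]:
                if candidate != person and candidate not in foes:
                    friends.add(tuple(sorted((person, candidate))))

    print(sorted(friends))  # [('ada', 'cru'), ('bob', 'dan')]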
Graph Algorithms
But what if we don’t know where to start the query or want to find patterns anywhere in the graph? We call these operations graph-global, and querying is not always the right way to tackle these challenges. A graph-global problem is often an indication that we should instead consider a graph algorithm.
For more comprehensive analysis, where we need to consider the entire graph (or substantial parts of it), graph algorithms provide an efficient and scalable way to produce results. We use them to achieve a particular goal, like looking for popular paths or influential nodes. For example, if we’re looking for the most influential person in a social graph, we’d use the PageRank algorithm, which measures the importance of a node in the graph relative to the others.
In Figure 4-3 we see a snapshot of part of a graph. Visually, we can
see that the node representing Rosa is the most connected, but that’s
an imprecise definition.
If we run the PageRank algorithm over the data in Figure 4-3, we
can see that Rosa has the highest PageRank score, indicating that
she’s more influential than other nodes in the data. We can use this
metadata in knowledge graphs by incorporating it as part of the
organizing principle, just like any other data item, to drive users
toward good decisions.
Figure 4-3. Node importance via the PageRank algorithm
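As a small illustration, here is PageRank run over an invented fragment of such a social graph using the open source networkx Python library (our choice here; any graph analytics engine with PageRank would do):

    import networkx as nx

    # An invented social graph; edges point from follower to followed.
    G = nx.DiGraph()
    G.add_edges_from([
        ("Karl", "Rosa"), ("Fred", "Rosa"), ("Anna", "Rosa"),
        ("Rosa", "Karl"), ("Anna", "Karl"),
    ])

    # alpha is the usual damping factor from the PageRank formulation.
    scores = nx.pagerank(G, alpha=0.85)
    print(max(scores, key=scores.get))  # Rosa: most (and best) incoming links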
ML Workflows for Graphs
Analyzing the output of graph queries and algorithms and using them to improve ML is great, but we can also write results back to enrich the graph. In doing so, those results become queryable and processable in a virtuous cycle that feeds the next round of algorithmic or ML analysis. Creating a closed loop within our decisioning knowledge graph means we can start with a graph, learn what’s significant, and then predict things about new data coming in, such as classifying someone as a probable fraudster, and write it back into the graph. The knowledge graph is enriched by the cycle shown in Figure 4-4.
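Continuing the networkx sketch from above, the write-back step is just a matter of storing the algorithm’s output as node properties so the next query or training run can use it:

    import networkx as nx

    G = nx.Graph([("Rosa", "Karl"), ("Karl", "Fred"), ("Rosa", "Anna")])

    # Run the algorithm, then enrich the graph with its results.
    scores = nx.pagerank(G)
    nx.set_node_attributes(G, scores, name="pagerank")

    # Downstream queries can now filter on the enriched property directly.
    influential = [n for n, d in G.nodes(data=True) if d["pagerank"] > 0.25]
    print(influential)  # ['Rosa', 'Karl'] -- the better-connected nodes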
Graph Visualization
Visualizing data, especially relationships, allows domain experts to
better explore connections and infer meaning. For knowledge
graphs, we need tools like those shown in Figure 2-3 to help visually
inspect raw data, understand analytics results, and share findings
with others.
Walking through relationships, expanding scenes, filtering views, and following specific paths are natural ways to investigate graphs. A visual representation of a graph needs to be dynamic and user customizable to support interactive exploration. In addition to direct query support, graph visualizations need low-code/no-code options and assistive search features, like type-ahead suggestions, to empower a broader range of users.
Data scientists also benefit from visualizing algorithm and ML results. With the level of abstraction raised visually, a data scientist can focus on the necessary complexity of an experiment and not become bogged down in accidental complexity. For example, our tools can visualize PageRank scores as node sizes, node classifications as icons, traversal cost as line thickness, and community groups as colors. With these visual abstractions, the important aspects of the underlying data are immediately apparent, where they would be hidden in raw data. Once a data scientist is satisfied with their results, a graph visualization tool enables them to quickly prototype and communicate findings with others, even where those others are not experts in graph technology.
Real-Time Decisioning
Real-time decisioning solutions require immediate analysis of the current options, matching them to the most appropriate choice (i.e., making a recommendation). For instance, in Chapter 2 we show how retail recommendations can be made in real time based on a knowledge graph.
It’s important to understand that interactive speeds prohibit online use of global algorithms and ML training. The computation cost and latency of large graph algorithms are simply too high on a per-request basis. Instead, graph algorithms, data science, and ML often operate on a different cadence from real-time queries. The more expensive processing runs in the background, continuously enriching the actioning knowledge graph, while real-time queries get better results over time as the underlying knowledge graph improves.
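As a sketch of that cadence split, assuming an invented user/product graph, the expensive graph-global work runs on a schedule and caches its output, while the request path is a cheap lookup:

    import networkx as nx

    # Invented purchase graph: users u1/u2 connected to products p1..p3.
    G = nx.Graph([("u1", "p1"), ("u1", "p2"), ("u2", "p2"), ("u2", "p3")])
    precomputed = {}  # user -> recommendations, refreshed off the hot path

    def background_refresh():
        """Expensive graph work: products bought by users who share a product."""
        for user in ("u1", "u2"):
            owned = set(G.neighbors(user))
            peers = {peer for p in owned for peer in G.neighbors(p)} - {user}
            precomputed[user] = sorted(
                {p for peer in peers for p in G.neighbors(peer)} - owned)

    def recommend(user):
        """Real-time path: a cheap lookup against the latest enrichment."""
        return precomputed.get(user, [])

    background_refresh()    # runs on a schedule, not per request
    print(recommend("u1"))  # ['p3'], served instantly from the cache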
Each of these use cases is valuable, and it’s possible that more than one may apply to your business. What’s nice about these use cases is that tooling already exists that can implement them. We don’t need to invest in building such tools; we just need to get our data into a format the tools can conveniently process. Often this can be as simple as storing it in a graph database. From here, we can incorporate the knowledge graph into our business processes to help make better decisions.
Consider the phrase “you saw her duck.” Without context, we can’t tell whether we’re asking if someone saw a friend’s pet bird or telling them to help butcher poultry for dinner. It’s hard to know the meaning of this text without context. Likewise, if we had to meticulously review inputs in isolation to interpret the phrase in Figure 5-1, let alone all the decision paths we make every day, the complexity would paralyze us.

Figure 5-1. Linked context used to interpret the phrase “you saw her duck”
Today’s AI does not generalize well. Instead, it is effective for specific, well-defined tasks. It struggles with ambiguity and mostly lacks the transfer learning that humans take for granted. For AI to make humanlike decisions that are more situationally appropriate, it needs to incorporate context.
Explainability
Predictions made by AI must be interpretable by experts and ultimately explainable to the public if AI systems are to expand their utility into more sectors. In the absence of understanding how decisions were reached, citizens may reject recommendations or outcomes that are counterintuitive. In systems where human safety is paramount, such as medical imaging, explainability becomes a critical aspect of running a system that will not harm people. Explainability isn’t a nice-to-have; it is a required component of AI, and being context driven improves explainability.
Graph technology is the best way to maintain the context for explainability. It offers a human-friendly way to evaluate connected data, enabling human overseers to better map and visualize AI decision paths. By better understanding the lineage of data (context of where it came from, cleansing methods used, and so forth), we can better evaluate and explain its influence on predictions made by the AI model.
Figure 5-3. Simplified graph data model for lights in one area of a smart home
The point is that AI benefits greatly from context: it enables probabilistic decision making for real-time answers, handles adjacent scenarios for broader applicability, and stays maximally relevant to a given situation. But all systems, including AI, are only as good as their inputs. Garbage in means garbage out, even if AI produces that garbage. To ensure that we use good-quality data to build AI systems, we need to understand that data’s provenance.
Tracking provenance in knowledge graphs helps with audit trails and existing compliance requirements too. Although AI norms and regulations are still evolving, we need to have systems to manage an increasing amount of this complex peripheral information.
Diversifying ML Data
Knowledge graphs enable us to employ context to help solve some
of the most difficult issues in training data: small data sizes and the
lack of data variety. Since training data is often scarce, graphs can
help squeeze out every possible feature, including relationships and
topological information. This might require a change in mindset to
diversify the type of data regularly included in ML.
Suppose you’re not using a knowledge graph. In that case, the tendency is to demand more training data, typically of the same type of information you’re already using. Additional data isn’t always readily available or affordable. Even if we can obtain more data, we need to be careful of diminishing returns and overfitting. Overfitting involves training predictive models too heavily on specific data or features, resulting in increased accuracy scores for that training data but a decline when the model is used on new data. Broadening the types of data used in ML, like adding relationships, is a simple tactic that guards against overfitting and expands results to broader scenarios.
Contextual information describes how entities relate to one another and to events, rather than static attributes of individual entities; it is a fundamentally different type of information. When building AI systems, we can extract context from graphs for more variety in training and improve predictions.
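A minimal sketch of that widening step, assuming an invented tabular feature set: graph-derived columns such as degree and PageRank are appended to each entity’s existing features before training.

    import networkx as nx

    # Invented relationship data between entities p1..p4.
    G = nx.Graph([("p1", "p2"), ("p2", "p3"), ("p3", "p1"), ("p3", "p4")])

    # Invented tabular features per entity, e.g., [age, risk_flag].
    tabular = {"p1": [54, 1], "p2": [61, 0], "p3": [47, 1], "p4": [70, 0]}

    # Widen each row with topology: degree and PageRank as extra features.
    pagerank = nx.pagerank(G)
    features = {node: cols + [G.degree(node), pagerank[node]]
                for node, cols in tabular.items()}

    print(features["p1"])  # [54, 1, 2, <p1's PageRank score>]

The widened feature table then feeds whatever model we already use; only the columns change.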
Imagine we’re trying to improve patient outcomes for a complex disease like diabetes. We can train a predictive model on things like current medications, test results, and patient demographics. Perhaps our model tells us which medications by age and gender predict good versus poor outcomes, and it’s 98% accurate according to our training and testing data.
However, in practice we find these predictions have a high false-positive rate. What factors could be missing? We could try to boost our results by simply broadening the number of features used in the model training, but that can lead to overfitting and still leaves out the context in which this disease occurs. Since complex diseases develop over time, the sequence of diagnoses and medications might be influential. Or perhaps contextual information like lifestyle might have a bigger impact than we expected.
Using graphs, we can capture the path of events over time down to
each doctor visit, test result, diagnosis, and medication change.
Figure 5-5 illustrates a patient journey for diabetes, a very complex disease, considering only a 90-day window around the diagnosis. The paths in this relatively small journey quickly become unwieldy, and it’s clear that we need more advanced tools when we add multidimensional context. To incorporate context into an AI system, we can use graph embeddings to codify the topology of the journey paths and make predictions based on all the surrounding information we have in context. We can also look at various predictive aspects, like the different professionals involved, and see how they might have a strong influence on outcomes.
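As a hedged sketch of the first step behind many graph-embedding methods (node2vec-style approaches, for example), we can sample random walks over an invented journey graph; a sequence model trained on those walks places events that co-occur on paths close together in vector space:

    import random

    # An invented, simplified patient-journey graph: event -> next events.
    journey = {
        "visit_1": ["test_A"],
        "test_A": ["diagnosis"],
        "diagnosis": ["med_X", "visit_2"],
        "med_X": ["visit_2"],
        "visit_2": [],
    }

    def random_walk(start, length=4):
        """Sample one path through the journey, following outgoing edges."""
        walk, node = [start], start
        for _ in range(length):
            next_events = journey.get(node, [])
            if not next_events:
                break
            node = random.choice(next_events)
            walk.append(node)
        return walk

    # These walks become the "sentences" a skip-gram model is trained on.
    walks = [random_walk("visit_1") for _ in range(100)]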
Better ML Processes
All of these graph-AI systems produce value only when they’re in production. The practice of MLOps (machine learning plus operations) has grown up around managing the life cycle of AI systems, with an emphasis on the tasks of taking an ML model into production. MLOps uses high levels of automation to ensure the quality of production ML as well as business and regulatory compliance.
It’s fitting that since graphs are a natural underlay for ML, they are also a natural underlay for MLOps. In fact, a knowledge graph makes an excellent AI system-of-truth, which gives a complete picture of the AI data. We can use a knowledge graph to collect and dynamically track all the building blocks of an AI system: train/test data, ML models, test results, production instances, and so on. With this system-of-truth, a data scientist can use an existing predictive model as a template for building a new model, quickly identify the building blocks to change, and automate sourcing the building blocks they intend to keep.
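An illustrative sketch of such a system-of-truth, with invented model and dataset names: each model version is linked to the exact building blocks it was assembled from, so diffing two models becomes a simple set operation.

    # Invented lineage edges for an AI system-of-truth.
    edges = [
        ("model_v1", "TRAINED_ON", "claims_2022"),
        ("model_v1", "EVALUATED_BY", "test_run_17"),
        ("model_v2", "TEMPLATED_FROM", "model_v1"),
        ("model_v2", "TRAINED_ON", "claims_2023"),
    ]

    def building_blocks(model):
        """Everything directly attached to a model: its reusable parts."""
        return {(rel, target) for src, rel, target in edges if src == model}

    # Diff a new model against its template to see exactly what changed.
    print(building_blocks("model_v2") - building_blocks("model_v1"))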
2 Using graphs to reduce predictive drift has been demonstrated in Jiaoyan Chen et al.,
“Knowledge Graph Embeddings for Dealing with Concept Drift in Machine Learning,”
Journal of Web Semantics 67 (February 2021).
Finally, we need to consider how and where we should keep the human in the loop, allowing a human user to change the outcome of an event or process. Although there are many ways to incorporate human interaction, knowledge graphs can help early in the process, when an ML model has been developed but not yet put into production. At this point in the process, we want to understand whether the model is making the correct predictions. An expert might test hypotheses by exploring similar communities in the knowledge graph or debug unexpected results by drilling into hierarchies and dependencies with an organized graph. In this context, graphs provide contextual information for investigations and counterfactual analysis by domain experts.
Improving AI Reasoning
Organizations are using knowledge graphs to improve the reasoning skills of AI itself by adding context to the decision process. Many AI systems employ heuristic decision making, which uses a strategy to find the most likely correct decision while avoiding the high cost (in time) of processing lots of information. We can think of those heuristics as shortcuts or rules of thumb that we would use to make fast decisions. For example, our hunter-gatherer ancestors would not have deliberated long about a rustle in the bushes. Instead, they would have quickly reacted to the situation as either a threat or an opportunity based on heuristics like time of day and location.
Knowledge graphs help AI make better decisions by adding contextual information to their heuristic strategies. In a chatbot example, our AI might have natural language understanding of words and phrases. But to provide the best experience, we need to infer intent based on short interactions with the user and update assumptions as a dialogue progresses. In Figure 5-6 we can see how a chat layer, ML models, and a knowledge layer can work together for continual updates in a recommended architecture.
Figure 5-6. Chatbot system architecture proposed by researchers3
3 Image is adapted from SoYeop Yoo and OkRan Jeong, “An Intelligent Chatbot Utilizing BERT Model and Knowledge Graph”, Journal of Society for e-Business Studies 24, no. 3 (2019).
Figure 5-7. Representation of the LPL Financial subgraph used for a financial chatbot
Figure 5-7 illustrates the graph data model with the addition of clarification steps that map AI questions to categories of options that are valid for particular answers. LPL Financial reduced the amount of detail needed in each step of the dialogue and pared down the number of options the chatbot had as it progressed. As the AI system was used, the actioning knowledge graph was further updated by weighting relationships based on how frequently they led to correct responses. We can see how using a knowledge graph to boost chatbots means faster answers and a more natural experience.
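The feedback loop described above can be sketched in a few lines. To be clear, this is an invented illustration of the weighting idea, not LPL Financial’s implementation:

    # Relationship weights in the dialogue graph; names are invented.
    weights = {("ask_balance", "clarify_account_type"): 1.0}

    def record_outcome(edge, correct, step=0.1):
        """Strengthen an edge that led to a correct response; else weaken it."""
        w = weights.get(edge, 1.0)
        weights[edge] = w + step if correct else max(w - step, 0.0)

    record_outcome(("ask_balance", "clarify_account_type"), correct=True)
    # Future dialogues prefer the highest-weighted outgoing edge at each step.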
Near-Future Breakthroughs
In a paper from DeepMind, Google Brain, MIT, and the University of Edinburgh, “Relational inductive biases, deep learning, and graph networks” (2018), researchers advocate for using a graph network, which “generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors.” These researchers believe that graphs have an excellent ability to generalize about data structures, which broadens the applicability and sophistication of AI systems.
The graph-network approach takes a graph as an input, performs learning computations while preserving transient states, and then returns a graph. This process allows a domain expert to review and validate the learning path, which leads to more explainable predictions.
Using graph networks also enables whole-graph learning and multitask predictions, which reduce data requirements and automate the identification of predictive features. This means richer and more accurate predictions that use less data and fewer training cycles.
The research concluded that graphs offer the next major advancement in ML, and we’re starting to see the groundwork of this research in production. In fact, the graph embeddings and graph-native ML covered in Chapter 4 are required first steps toward building such graph networks.
With an enterprise digital twin, we can also support business-process optimization to improve the bottom line, look for weaknesses in supply chains to test the resilience of the business when subjected to external shocks, perform succession planning for staffing roles to ensure business continuity, and perform a myriad of other tasks that perhaps exist only informally today.
The enterprise digital twin provides key values:
Advanced Patterns
There are many ways a system can be compromised and, accordingly, there are many paths through the system we’d like to prohibit. In a real system, we would likely map all ports and protocols between systems, comparing them to our architecture as intended (the organizing principle). We might also add in human operators and their social patterns so that we can understand the extended attack surface (compromised employees included) and rapidly know who’s best placed to respond with either human intervention or automation. By rapidly discovering illegal paths and deviations in the graph representing the system, we uncover patterns of possible misbehavior and can address them quickly.
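At its core, that check is a comparison of observed connections against the intended architecture. A toy sketch with invented system names:

    # The organizing principle: connections the architecture permits.
    allowed = {("web", "app"), ("app", "db")}

    # Connections actually observed on the network.
    observed = [("web", "app"), ("app", "db"), ("web", "db")]

    violations = [edge for edge in observed if edge not in allowed]
    print(violations)  # [('web', 'db')]: a path the architecture prohibits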
In reality, the world isn’t so separated into such neat layers. The computer servers that run this system are probably located near one another, in the same data centers and possibly even in the same rack, or as virtual machines on the same physical host. This means they have things in common like switches, routers, cabinets, power supplies, and cooling, to name but a few. Knowing this information, we understand that a brownout causing an air conditioner to halt can also cause disruption for a user thousands of miles away running an app on their phone. And if we understand it, perhaps a computer can too.
Adopting knowledge graphs does not mean having to replace technical projects. Instead, we’ve shown that knowledge graphs can be deployed as a nondisruptive technology sitting alongside existing systems while simultaneously enhancing their utility and value.
The construction of knowledge graphs is more bazaar than cathedral. You don’t need grand designs to get started. In fact, we believe that a disciplined, iterative approach is superior in many regards. So start small, and solve a single important problem first.
When you’ve shown value with your first project, move on to the next, and be prepared to refactor some of the data. Knowledge graphs are flexible, so such refactorings are not to be feared; they are part and parcel of working with a living, valuable system. As you find you need more tooling (e.g., support for ontologies or ontological languages), bring it into scope, but only as much as you need to solve the immediate problems.
Despite the years of academic research that underpin the approach, knowledge graphs are new to many of us. We recommend that you make the best use of communities of practice, vendors, and subject matter experts to get off the ground. Even if you don’t strictly need the help, your confidence will be bolstered in those critical early days.
Acknowledgments
It’s been a pleasure working on this material, and we thank all those who assisted us. We’re incredibly grateful to Dr. Maya Natarajan for her guidance and insights throughout the process, and to Gordon Campbell and Dr. Juan Sequeda for their invaluable feedback.