NoSQL: Knowledge Engineering and Representation
NoSQL comes in many different variants
• Non-relational
• Flexible schema
• Query languages other than, or in addition to, SQL
• Distributed – horizontal scaling
• Less structured data
• Supports big data
The Benefits of NoSQL
[https://www.mongodb.com/nosql-explained]
Overview
• Introduction and Motivation
• Categories of NoSQL
• Examples of NoSQL systems
• Encodings
• Querying
• Examples
• Summary
NoSQL Database Types
[https://www.mongodb.com/nosql-explained]
• Graph stores are used to store information about networks of data, such as
social connections. Graph stores include Neo4j and triple stores like Fuseki.
• Document databases pair each key with a complex data structure known as a
document.
• Key-value stores are the simplest NoSQL databases. Every single item in the
database is stored as an attribute name (or 'key'), together with its value.
Examples of key-value stores are Riak and Berkeley DB.
• Wide-column stores such as Cassandra and HBase are optimized for queries
over large datasets, and store columns of data together, instead of rows.
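To make the four categories concrete, here is a minimal sketch in Python that holds the same kind of information in each style. Plain dicts, lists and tuples stand in for the actual stores; the users, field names and values are invented for illustration and do not reflect any product's API. The wide-column part follows the description above (columns stored together) in the simplest possible way.

```python
# Illustrative only: plain Python structures stand in for real stores.
# All names and values below are hypothetical.

# Key-value store: an opaque value looked up by its key.
key_value = {"user:1": "Ada"}

# Document store: each key maps to a nested, self-describing document.
document = {
    "user:1": {
        "name": "Ada",
        "languages": ["en", "fr"],
        "address": {"city": "London"},
    },
}

# Wide-column store: values are grouped by column, so one column can be
# scanned across many rows without reading the other columns.
wide_column = {
    "name": {"user:1": "Ada", "user:2": "Bob"},
    "city": {"user:1": "London"},  # sparse: user:2 has no city
}

# Graph store: nodes plus explicit relationships between them.
nodes = {"user:1": {"name": "Ada"}, "user:2": {"name": "Bob"}}
edges = [("user:1", "KNOWS", "user:2")]


def friends_of(node_id):
    """Follow KNOWS edges outward from one node."""
    return [dst for src, rel, dst in edges if src == node_id and rel == "KNOWS"]


print(key_value["user:1"])
print(document["user:1"]["address"]["city"])
print(sorted(wide_column["name"].values()))
print(friends_of("user:1"))
```

The point of the comparison is the access path: by key only, by key plus nested content, by column, or by following relationships.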
Document Store
• The central concept is the notion of a "document", which corresponds to a row
in an RDBMS.
• Documents are addressed in the database via a unique key that represents
that document.
• The database offers an API or query language that retrieves documents based
on their contents.
• Documents are schema-free, i.e., different documents can have structures and
schemas that differ from one another. (An RDBMS requires that each row
contain the same columns.)
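The three properties above (unique key, retrieval by content, no fixed schema) can be sketched with a toy in-memory store. This is a hedged illustration only: the `DocumentStore` class, its method names and the sample documents are hypothetical and are not the API of any real document database.

```python
import uuid


class DocumentStore:
    """A toy in-memory document store (hypothetical, for illustration)."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc):
        """Store a document under a freshly generated unique key."""
        key = uuid.uuid4().hex
        self._docs[key] = dict(doc)
        return key

    def get(self, key):
        """Address a document directly by its key."""
        return self._docs[key]

    def find(self, **criteria):
        """Retrieve documents by content: every given field must match."""
        return [
            doc
            for doc in self._docs.values()
            if all(
                field in doc and doc[field] == value
                for field, value in criteria.items()
            )
        ]


store = DocumentStore()
# Two documents with different fields: no shared schema is required.
note_key = store.insert({"kind": "note", "owner": "ada", "text": "hello"})
photo_key = store.insert({"kind": "photo", "owner": "ada", "width": 640})

print(store.get(note_key)["text"])
print(len(store.find(owner="ada")))
print(store.find(kind="photo")[0]["width"])
```

A query on a field that only some documents have, such as `width`, simply skips the documents that lack it.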
MongoDB: two documents (JSON-like):
{
_id: ObjectId("51156a1e056d6f966f268f81"),
type: "Article",
author: "Derick Rethans",
title: "Introduction to Document Databases with MongoDB",
date: ISODate("2013-04-24T16:26:31.911Z"),
body: "This arti…"
},
{
_id: ObjectId("51156a1e056d6f966f268f82"),
type: "Book",
author: "Derick Rethans",
title: "php|architect's Guide to Date and Time Programming with PHP",
isbn: "978-0-9738621-5-7"
}
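Note that ObjectId(...) and ISODate(...) are MongoDB shell helpers, so the listing above is not strict JSON. As a hedged sketch (this is not a MongoDB driver API), the same two documents can be held as Python dicts and serialized to plain JSON by converting the date to an ISO 8601 string. Keeping `_id` as a plain hex string and the helper name `encode_extra` are simplifications made here.

```python
import json
from datetime import datetime, timezone

documents = [
    {
        "_id": "51156a1e056d6f966f268f81",  # simplification: hex string, not an ObjectId
        "type": "Article",
        "author": "Derick Rethans",
        "title": "Introduction to Document Databases with MongoDB",
        "date": datetime(2013, 4, 24, 16, 26, 31, 911000, tzinfo=timezone.utc),
        "body": "This arti…",  # elided in the original listing
    },
    {
        "_id": "51156a1e056d6f966f268f82",
        "type": "Book",
        "author": "Derick Rethans",
        "title": "php|architect's Guide to Date and Time Programming with PHP",
        "isbn": "978-0-9738621-5-7",
    },
]


def encode_extra(value):
    """json.dumps hook: turn datetimes into ISO 8601 strings."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"cannot serialize {type(value).__name__}")


as_json = json.dumps(documents, default=encode_extra, ensure_ascii=False, indent=2)
round_tripped = json.loads(as_json)

# The two documents share some fields but not others.
shared = set(documents[0]) & set(documents[1])
only_in_book = set(documents[1]) - set(documents[0])
print(sorted(shared))
print(sorted(only_in_book))
```

After the round trip the date is an ordinary string, which is why document databases usually define their own extended types for values such as dates and object identifiers.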
What's the most popular NoSQL database?
[https://www.quora.com/Whats-the-most-popular-NoSQL-database]
So, what's the most popular NoSQL database?
Method of calculating the scores of the DB-Engines Ranking
[http://db-engines.com/en/ranking_definition]
[http://www.kdnuggets.com/2016/06/top-nosql-database-engines.html]
[http://db-engines.com/en/ranking_trend]
Neo4j
• Graph-oriented
• Implemented in Java and accessible from software written in other languages using the Cypher
query language through a transactional HTTP endpoint.
• ACID-compliant transactional database with native graph storage and processing.
• The most popular graph database.
• Everything is stored as an edge, a node or an attribute.
• Each node and edge can have any number of attributes.
• Both the nodes and edges can be labelled.
• Labels can be used to narrow searches.
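The data model in the bullets above (nodes, edges, attributes on both, labels that narrow a search) can be sketched in a few lines of Python. This is an illustration of the model only: the `Node`, `Edge` and `Graph` classes and the sample data are hypothetical and say nothing about how Neo4j stores or queries data internally.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    start: Node
    end: Node
    label: str
    properties: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.nodes = []
        self.edges = []

    def add_node(self, *labels, **properties):
        node = Node(set(labels), properties)
        self.nodes.append(node)
        return node

    def add_edge(self, start, label, end, **properties):
        edge = Edge(start, end, label, properties)
        self.edges.append(edge)
        return edge

    def find(self, label, **properties):
        """Narrow by label first, then filter on attribute values."""
        return [
            node
            for node in self.nodes
            if label in node.labels
            and all(node.properties.get(k) == v for k, v in properties.items())
        ]


g = Graph()
alice = g.add_node("Person", name="Alice", age=34)
acme = g.add_node("Company", name="Acme")
other = g.add_node("Company", name="Alice")  # same name, different label
job = g.add_edge(alice, "WORKS_AT", acme, since=2015)

people_named_alice = g.find("Person", name="Alice")
print(len(people_named_alice))
print(job.label, job.properties)
```

Without the label, a search on `name="Alice"` would have to consider every node; with it, the company that happens to share the name is never a candidate.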
The following slides are copied from a presentation by Jim Webber, Neo4j.
[Figure: example graph. Relationship labels: stole, companion, loves, enemy, from, appeared in. Episode nodes: "A Good Man Goes to War", "Victory of the Daleks".]
Property Graph Model
[Figure: property graph built up over three slides.
Node properties: (name: the Doctor, age: 907, species: Time Lord); (first name: Rose, last name: Tyler); (vehicle: tardis, model: Type 40).
Relationships: TRAVELS_WITH (twice), LOVES, BORROWED (year: 1963), TRAVELS_IN.]
Graphs are very whiteboard-friendly
What’s Neo4j?
• It is a graph database
• Embeddable, or run as a server
• Full ACID transactions
– don’t mess around with durability, ever.
• Schema free
More on Neo4j
• Neo4j is stable
– In 24/7 operation since 2003
• Neo4j is under active development
• High performance graph operations
– Traverses 1,000,000+ relationships / second on
commodity hardware
Neo4j Logical Architecture
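// Assumption: db is an existing GraphDatabaseService;
// susan and theDoctor are existing Node objects.
Transaction tx = db.beginTx();
try {
    susan.createRelationshipTo(theDoctor,
        DynamicRelationshipType.withName("COMPANION_OF"));
    tx.success();  // mark the transaction as successful
} finally {
    tx.finish();   // commits if success() was called, otherwise rolls back
}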
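The fragment above follows a mark-then-finish pattern: tx.success() only marks the transaction as good, and tx.finish() in the finally block commits it or, if success() was never reached, rolls it back. A minimal Python sketch of that control flow follows; `ToyTransaction` and `add_relationship` are hypothetical stand-ins written for this illustration, not the Neo4j API.

```python
class ToyTransaction:
    """Hypothetical stand-in for a transaction; not the Neo4j API."""

    def __init__(self, store):
        self._store = store
        self._pending = []
        self._successful = False

    def write(self, item):
        """Buffer a change; nothing reaches the store yet."""
        self._pending.append(item)

    def success(self):
        """Mark the transaction as OK to commit."""
        self._successful = True

    def finish(self):
        """Commit if success() was called; otherwise discard the writes."""
        if self._successful:
            self._store.extend(self._pending)
        self._pending = []


def add_relationship(store, relationship, fail=False):
    tx = ToyTransaction(store)
    try:
        tx.write(relationship)
        if fail:
            raise RuntimeError("simulated failure before success()")
        tx.success()
    finally:
        tx.finish()  # always runs, even when an exception is propagating


store = []
add_relationship(store, ("susan", "COMPANION_OF", "theDoctor"))
try:
    add_relationship(store, ("rose", "COMPANION_OF", "theDoctor"), fail=True)
except RuntimeError:
    pass

print(store)
```

Only the first relationship ends up in `store`, because the second call raises before success() is reached and finish() discards its buffered write.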
Indexing a Graph?
• Graphs are their own indexes!
• But sometimes we want shortcuts to well-known nodes
• Can do this in our own code
– Just keep a reference to any interesting nodes
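As a rough sketch of this idea in Python: the adjacency lists are the "index" (a node leads directly to its neighbours), and a plain dict kept in application code provides the shortcut to a well-known starting node. The graph data and the names `well_known` and `reachable` are invented for this example.

```python
from collections import deque

# Adjacency lists: each node id maps to the nodes it points at (hypothetical data).
graph = {
    "doctor": ["rose", "tardis"],
    "rose": ["tardis"],
    "tardis": [],
    "dalek": ["doctor"],
}

# Shortcut kept in our own code: names for nodes we often start from.
well_known = {"the Doctor": "doctor"}


def reachable(start):
    """Breadth-first traversal following edges outward from start."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


start = well_known["the Doctor"]
visited = reachable(start)
print(visited)
```

No lookup structure is consulted after the first step: each hop uses only the current node's own list of neighbours.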
Why graph matching?
• It’s super-powerful for looking for patterns in a
data set
– E.g. retail analytics
• Higher-level abstraction than raw traversers
– You do less work!
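To illustrate the difference from hand-written traversal, here is a small, hedged sketch of declarative pattern matching in Python. The pattern says what shape to look for, and a generic `match` function finds every way to bind the variables. The purchase data, the `?name` variable convention and the function itself are assumptions made for this example; this is not Cypher and not Neo4j's matching engine.

```python
# Hypothetical purchase data as (subject, relationship, object) triples.
edges = [
    ("alice", "BOUGHT", "tea"),
    ("bob", "BOUGHT", "tea"),
    ("bob", "BOUGHT", "biscuits"),
    ("carol", "BOUGHT", "coffee"),
]


def match(pattern, bindings=None):
    """Yield every variable binding that satisfies all pattern triples.

    Names starting with '?' are variables; anything else must match exactly.
    """
    bindings = bindings or {}
    if not pattern:
        yield dict(bindings)
        return
    first, rest = pattern[0], pattern[1:]
    for edge in edges:
        candidate = dict(bindings)
        ok = True
        for want, got in zip(first, edge):
            if want.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if candidate.setdefault(want, got) != got:
                    ok = False
                    break
            elif want != got:
                ok = False
                break
        if ok:
            yield from match(rest, candidate)


# Pattern: two different customers who bought the same product.
pattern = [("?a", "BOUGHT", "?p"), ("?b", "BOUGHT", "?p")]
pairs = sorted(
    (m["?a"], m["?b"], m["?p"]) for m in match(pattern) if m["?a"] < m["?b"]
)
print(pairs)
```

Changing the question (for example, to customers who bought both tea and biscuits) only requires a new pattern, not a new traversal routine.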