
Nosql : Knowledge Engineering and Representation

The document discusses NoSQL databases and provides an overview of their categories, characteristics, and popular examples like MongoDB, Cassandra, Redis, and Neo4j. Key-value stores store each item as a key-value pair, document databases store flexible schema documents, graph databases store network-like graphs, and wide-column stores optimize for large datasets. Popular NoSQL databases are evaluated based on factors like mentions, searches, discussions, jobs, and social media to determine their relative popularity and ranking.

Uploaded by

valer

IKT437

Knowledge Engineering and Representation

NoSQL ~ No SQL or Not Only SQL

Jan Pettersen Nytun, UiA


Overview
• Introduction and Motivation
• Categories of NoSQL
• Examples of NoSQL systems
• Encodings
• Querying
• Examples
• Summary

NoSQL – Comes in many different variants

Some Possible Characteristics

Not every system supports all of these characteristics.

• Non-relational
• Flexible schema
• Other or additional query languages than SQL
• Distributed – horizontal scaling
• Less structured data
• Supports big data

The Benefits of NoSQL
[https://www.mongodb.com/nosql-explained]

When compared to relational databases, NoSQL databases are
more scalable and provide superior performance, and their data
model addresses several issues that the relational model is not
designed to address:
• Geographically distributed architecture instead of expensive,
monolithic architecture
• Large volumes of rapidly changing structured, semi-structured,
and unstructured data
• Agile sprints, quick schema iteration, and frequent code
pushes
• Object-oriented programming that is easy to use and flexible
[ref: http://www.cs.tut.fi/~tjm/seminars/nosql2012/NoSQL-Intro.pdf]

NoSQL Database Types
[https://www.mongodb.com/nosql-explained]

• Graph stores are used to store information about networks of data, such as
social connections. Graph stores include Neo4j and triple stores like Fuseki.

• Document databases pair each key with a complex data structure known as a
document.

• Key-value stores are the simplest NoSQL databases. Every single item in the
database is stored as an attribute name (or 'key'), together with its value.
Examples of key-value stores are Riak and Berkeley DB.

• Wide-column stores such as Cassandra and HBase are optimized for queries
over large datasets, and store columns of data together, instead of rows.
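
The key-value idea above can be sketched in a few lines of Java. This is a minimal in-memory illustration, not the API of Riak or Berkeley DB; the class name `KeyValueSketch` and the sample key are hypothetical. The point it shows is that the store treats the value as opaque bytes, so the only way to find an item is by its key.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory key-value store (illustration only).
class KeyValueSketch {
    // The value is stored as opaque bytes; the store never looks inside it.
    private final Map<String, byte[]> items = new HashMap<>();

    void put(String key, String value) {
        items.put(key, value.getBytes(StandardCharsets.UTF_8));
    }

    // Lookup is possible by key only; there is no query on the value's contents.
    Optional<String> get(String key) {
        byte[] raw = items.get(key);
        return raw == null
                ? Optional.empty()
                : Optional.of(new String(raw, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        KeyValueSketch store = new KeyValueSketch();
        store.put("session:42", "{\"user\":\"rose\"}"); // assumed sample data
        System.out.println(store.get("session:42").orElse("<missing>"));
        System.out.println(store.get("session:43").orElse("<missing>"));
    }
}
```

A document database differs mainly in that the store understands the structure of the value and can query on it.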

Document Store
• The central concept is the notion of a "document", which corresponds to a row
in an RDBMS.

• A document is encoded in a standard format such as JSON (or BSON, the binary
form MongoDB uses).

• Documents are addressed in the database via a unique key that represents
that document.

• The database offers an API or query language that retrieves documents based
on their contents.

• Documents are schema free, i.e., different documents can have structures and
schema that differ from one another. (An RDBMS requires that each row
contain the same columns.)
MongoDB: two documents (JSON):
{
_id: ObjectId("51156a1e056d6f966f268f81"),
type: "Article",
author: "Derick Rethans",
title: "Introduction to Document Databases with MongoDB",
date: ISODate("2013-04-24T16:26:31.911Z"),
body: "This arti…"
},
{
_id: ObjectId("51156a1e056d6f966f268f82"),
type: "Book",
author: "Derick Rethans",
title: "php|architect's Guide to Date and Time Programming with PHP",
isbn: "978-0-9738621-5-7"
}
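
The two documents above can be used to sketch what "schema free" and "retrieves documents based on their contents" mean in practice. The following is a minimal in-memory illustration in Java, not MongoDB's driver API; the class `DocumentStoreSketch` and its methods are hypothetical names, and the string ids simply reuse the ObjectId values shown above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory document store (illustration only, not MongoDB's API).
class DocumentStoreSketch {
    // Each document is addressed by a unique key.
    private final Map<String, Map<String, Object>> docs = new LinkedHashMap<>();

    void insert(String id, Map<String, Object> doc) {
        docs.put(id, doc);
    }

    Map<String, Object> findById(String id) {
        return docs.get(id);
    }

    // Query by content: ids of all documents whose field equals the given value.
    List<String> findIdsByField(String field, Object value) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> entry : docs.entrySet()) {
            if (value.equals(entry.getValue().get(field))) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    // Builds the Article and Book documents; they share no fixed schema.
    static DocumentStoreSketch sample() {
        DocumentStoreSketch store = new DocumentStoreSketch();
        Map<String, Object> article = new LinkedHashMap<>();
        article.put("type", "Article");
        article.put("author", "Derick Rethans");
        article.put("title", "Introduction to Document Databases with MongoDB");
        Map<String, Object> book = new LinkedHashMap<>();
        book.put("type", "Book");
        book.put("author", "Derick Rethans");
        book.put("isbn", "978-0-9738621-5-7"); // a field the article does not have
        store.insert("51156a1e056d6f966f268f81", article);
        store.insert("51156a1e056d6f966f268f82", book);
        return store;
    }

    public static void main(String[] args) {
        DocumentStoreSketch store = sample();
        System.out.println(store.findIdsByField("author", "Derick Rethans"));
        System.out.println(store.findIdsByField("type", "Book"));
    }
}
```

A real document database would answer such queries through indexes rather than a full scan; the scan here only keeps the sketch short.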
What's the most popular NoSQL database?
[https://www.quora.com/Whats-the-most-popular-NoSQL-database]

Vadim Ismakaev, Co-Founder at Grace, updated Apr 27, 2015

• Asking “what NoSQL database is the most popular” is a bit
incorrect since different problems require different types of
NoSQL solutions. …focus on solving very specific problems.
While this allows to achieve the best possible results in those
specific cases, it comes at a cost of some other functionalities.

So - what's the most popular NoSQL database?

Top NoSQL Database Engines
by http://www.kdnuggets.com/2016/06/top-nosql-database-engines.html

Next Two Slides:

Method of calculating the scores of the DB-Engines Ranking
[http://db-engines.com/en/ranking_definition]

We measure the popularity of a system by using the following parameters:

• Number of mentions of the system on websites, …
• General interest in the system. For this measurement, we use the frequency
of searches in Google Trends.
• Frequency of technical discussions about the system... Stack Overflow …
• Number of job offers, in which the system is mentioned...
• Number of profiles in professional networks, in which the system is
mentioned... LinkedIn …
• Relevance in social networks. We count the number of Twitter tweets, in
which the system is mentioned.

[http://www.kdnuggets.com/2016/06/top-nosql-database-engines.html]

Document databases: MongoDB
Wide-column stores: Cassandra and HBase
Key-value stores: Redis
Graph databases: Neo4j

[http://db-engines.com/en/ranking_trend]

The DB-Engines Ranking ranks database management systems
according to their popularity – not only NoSQL databases

Neo4j
• Graph-oriented
• Implemented in Java and accessible from software written in other languages
using the Cypher query language through a transactional HTTP endpoint.
• ACID-compliant transactional database with native graph storage and processing.
• The most popular graph database.
• Everything is stored as an edge, a node or an attribute.
• Each node and edge can have any number of attributes.
• Both the nodes and edges can be labelled.
• Labels can be used to narrow searches.
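
The model in the bullets above (nodes and edges, each with any number of attributes, both labelled) can be sketched as a small in-memory structure. This is an illustration in plain Java, not Neo4j's API or storage format; the class `PropertyGraphSketch`, the labels "Character" and "Vehicle", and the direction of the BORROWED edge are assumptions. The attribute values are borrowed from the Doctor Who example used later in the deck.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory property graph (illustration only, not Neo4j's API).
class PropertyGraphSketch {
    static class Node {
        final String label;
        final Map<String, Object> attributes = new HashMap<>();

        Node(String label) {
            this.label = label;
        }
    }

    static class Edge {
        final Node start;
        final String label;
        final Node end;
        final Map<String, Object> attributes = new HashMap<>();

        Edge(Node start, String label, Node end) {
            this.start = start;
            this.label = label;
            this.end = end;
        }
    }

    final List<Node> nodes = new ArrayList<>();
    final List<Edge> edges = new ArrayList<>();

    Node addNode(String label) {
        Node node = new Node(label);
        nodes.add(node);
        return node;
    }

    Edge addEdge(Node start, String label, Node end) {
        Edge edge = new Edge(start, label, end);
        edges.add(edge);
        return edge;
    }

    // The label narrows the search: nodes with another label never match.
    List<Node> findNodes(String label, String key, Object value) {
        List<Node> hits = new ArrayList<>();
        for (Node node : nodes) {
            if (node.label.equals(label) && value.equals(node.attributes.get(key))) {
                hits.add(node);
            }
        }
        return hits;
    }

    static PropertyGraphSketch sample() {
        PropertyGraphSketch graph = new PropertyGraphSketch();
        Node doctor = graph.addNode("Character");   // label is an assumption
        doctor.attributes.put("name", "the Doctor");
        doctor.attributes.put("species", "Time Lord");
        Node tardis = graph.addNode("Vehicle");     // label is an assumption
        tardis.attributes.put("model", "Type 40");
        Edge borrowed = graph.addEdge(doctor, "BORROWED", tardis); // direction assumed
        borrowed.attributes.put("year", 1963);
        return graph;
    }

    public static void main(String[] args) {
        PropertyGraphSketch graph = sample();
        System.out.println(graph.findNodes("Character", "species", "Time Lord").size());
        System.out.println(graph.edges.get(0).label);
    }
}
```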

The following slides are copied from a presentation by Jim Webber.

Neo4j
[Figure: example graph. Relationship labels: "stole from", "loves", "companion",
"enemy", "appeared in". Episode nodes: "A Good Man Goes to War", "Victory of
the Daleks".]
Property Graph Model
[Figure: property graph with three nodes and labelled relationships, built up
over three slides.]
• Node: name: the Doctor, age: 907, species: Time Lord
• Node: first name: Rose, last name: Tyler
• Node: vehicle: tardis, model: Type 40
• Relationships: TRAVELS_WITH (shown twice), LOVES, TRAVELS_IN,
BORROWED (year: 1963)
Graphs are very whiteboard-friendly
What’s Neo4j?
• It’s a graph database
• Runs embedded or as a server
• Full ACID transactions
– don’t mess around with durability, ever.
• Schema free
More on Neo4j
• Neo4j is stable
– In 24/7 operation since 2003
• Neo4j is under active development
• High performance graph operations
– Traverses 1,000,000+ relationships / second on
commodity hardware
Neo4j Logical Architecture
(layers, top to bottom)
Java | Ruby | … | Clojure
REST API | JVM Language Bindings
Traversal Framework | Graph Matching
Core API
Caches
Memory-Mapped (N)IO
Filesystem
Data access is programmatic
• Through the Java APIs
– JVM languages have bindings to the same APIs
• JRuby, Jython, Clojure, Scala…
• Managing nodes and relationships
• Indexing
• Traversing
• Path finding
• Pattern matching
Core API
• Deals with graphs in terms of their fundamentals:
  – Nodes
    • Properties
      – KV pairs
  – Relationships
    • Start node
    • End node
    • Properties
      – KV pairs
Creating Nodes
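import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;
import org.neo4j.kernel.EmbeddedGraphDatabase;

// Neo4j 1.x embedded API: open (or create) a database stored under /tmp/neo
GraphDatabaseService db = new EmbeddedGraphDatabase("/tmp/neo");
Transaction tx = db.beginTx();
try {
    Node theDoctor = db.createNode();
    theDoctor.setProperty("character", "the Doctor");
    tx.success();   // mark the transaction as successful
} finally {
    tx.finish();    // commits if success() was called, otherwise rolls back
}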
Creating Relationships
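// Assumes db is the GraphDatabaseService opened under "Creating Nodes"
// (also needs: import org.neo4j.graphdb.DynamicRelationshipType;)
Transaction tx = db.beginTx();
try {
    Node theDoctor = db.createNode();
    theDoctor.setProperty("character", "The Doctor");

    Node susan = db.createNode();
    susan.setProperty("firstname", "Susan");
    susan.setProperty("lastname", "Campbell");

    // Directed relationship: susan -[COMPANION_OF]-> theDoctor
    susan.createRelationshipTo(theDoctor,
        DynamicRelationshipType.withName("COMPANION_OF"));

    tx.success();
} finally {
    tx.finish();
}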
Indexing a Graph?
• Graphs are their own indexes!
• But sometimes we want short-cuts to well-known nodes
• Can do this in our own code
– Just keep a reference to any interesting nodes
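
"Keeping a reference to interesting nodes" in our own code can be as simple as a map from a name we choose to the node object. The sketch below is plain Java with a stand-in `Node` class; it is not Neo4j's index API, and `NodeShortcutsSketch`, `remember` and `lookup` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application-level shortcut table (not Neo4j's index API).
class NodeShortcutsSketch {
    // Stand-in for a graph node; a real application would hold the database's node type.
    static class Node {
        final Map<String, Object> properties = new HashMap<>();
    }

    // Well-known nodes, keyed by a name the application picks.
    private final Map<String, Node> wellKnown = new HashMap<>();

    void remember(String name, Node node) {
        wellKnown.put(name, node);
    }

    // Returns the remembered node, or null if nothing was stored under that name.
    Node lookup(String name) {
        return wellKnown.get(name);
    }

    public static void main(String[] args) {
        NodeShortcutsSketch shortcuts = new NodeShortcutsSketch();
        Node theDoctor = new Node();
        theDoctor.properties.put("character", "the Doctor");
        shortcuts.remember("doctor", theDoctor);

        // Traversals can now start from this node without searching the graph.
        System.out.println(shortcuts.lookup("doctor").properties.get("character"));
    }
}
```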
Why graph matching?
• It’s super-powerful for looking for patterns in a
data set
– E.g. retail analytics
• Higher-level abstraction than raw traversers
– You do less work!
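
As a rough illustration of the retail analytics case, the sketch below matches the pattern (product) <-[BOUGHT]- (customer) -[BOUGHT]-> (other product) against a plain list of edges. It is a brute-force example in plain Java and says nothing about how Neo4j's graph matching is implemented; the class `GraphMatchingSketch`, the BOUGHT relationship and the sample customers are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical brute-force pattern match over an edge list (illustration only).
class GraphMatchingSketch {
    static class Edge {
        final String from;
        final String type;
        final String to;

        Edge(String from, String type, String to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }
    }

    // Pattern: (product) <-[BOUGHT]- (customer) -[BOUGHT]-> (other).
    // Returns every "other" product, sorted alphabetically.
    static Set<String> alsoBought(List<Edge> edges, String product) {
        Set<String> result = new TreeSet<>();
        for (Edge first : edges) {
            if (!first.type.equals("BOUGHT") || !first.to.equals(product)) {
                continue; // first leg of the pattern does not match
            }
            for (Edge second : edges) {
                if (second.type.equals("BOUGHT")
                        && second.from.equals(first.from)
                        && !second.to.equals(product)) {
                    result.add(second.to);
                }
            }
        }
        return result;
    }

    // Assumed sample data.
    static List<Edge> sample() {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("alice", "BOUGHT", "tea"));
        edges.add(new Edge("alice", "BOUGHT", "teapot"));
        edges.add(new Edge("bob", "BOUGHT", "tea"));
        edges.add(new Edge("bob", "BOUGHT", "biscuits"));
        edges.add(new Edge("carol", "BOUGHT", "coffee"));
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(alsoBought(sample(), "tea")); // prints [biscuits, teapot]
    }
}
```

The caller states the shape of the pattern and gets the matches back, instead of hand-writing each traversal step, which is the "higher-level abstraction" the slide refers to.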
