Neo4j Notes
Introduction
NoSQL is an approach to databases that represents a shift away from traditional relational
database management systems (RDBMS).
NoSQL can mean “not SQL” or “not only SQL.”
Relational databases rely on tables, columns, rows, or schemas to organize and retrieve data.
In contrast, NoSQL databases do not rely on these structures and use more flexible data
models.
Benefits
◦ Scalability
◦ Performance
◦ High Availability
◦ Global Availability
◦ Flexible Data Modeling
Types of NoSQL Databases
Key-value data stores: data is represented as a collection of key-value pairs, such that each
possible key appears at most once in the collection (e.g., ArangoDB, InfinityDB, Oracle
NoSQL Database, Redis, and dbm).
Document stores: similar to key-value stores, but each value is a single document that
stores all data related to a specific key, typically encoded as XML, JSON, or BSON.
Wide-column stores: group columns of related data together. A query can retrieve related
data in a single operation because only the columns associated with the query are retrieved
(e.g., Accumulo, Cassandra, Druid, HBase, Vertica, SAP HANA).
Graph databases: data is represented as a graph of elements interconnected by a finite
number of relations between them (e.g., AllegroGraph, ArangoDB, InfiniteGraph, Apache
Giraph, MarkLogic, Neo4j).
Graph Databases
There are no isolated pieces of information, but rich, connected domains all around us.
Only a database that embraces relationships as a core aspect of its data model is able to
store, process, and query connections efficiently.
In relational databases, references to other rows and tables are indicated by referring to
their (primary-)key attributes via foreign-key columns. This is enforceable with constraints,
but only when the reference is never optional. Joins are computed at query time by matching
primary- and foreign-keys of the many (potentially indexed) rows of the to-be-joined tables.
These operations are compute- and memory-intensive, and their cost grows rapidly as more tables are joined.
In graph databases, relationships take first priority. This means your application doesn’t
have to infer data connections using things like foreign keys or out-of-band processing, such
as MapReduce. The data model for a graph database is also significantly simpler and more
expressive than those of relational or other NoSQL databases.
Graph databases are based on graph theory, and employ nodes, edges, and properties.
Nodes represent entities such as people, businesses, accounts, or any other item to be
tracked. They are roughly the equivalent of the record, relation, or row in a relational
database, or the document in a document database.
Edges, also termed relationships, are the lines that connect nodes to other
nodes; they represent the relationship between them. Meaningful patterns emerge
when examining the connections and interconnections of nodes, properties, and
edges. Edges are the key concept in graph databases, representing an abstraction that
is not directly implemented in other systems.
Properties are pieces of information attached to nodes. For example, if Wikipedia
were one of the nodes, it might be tied to properties such as website, reference
material, or word that starts with the letter w, depending on which aspects of
Wikipedia are relevant to a given database.
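The three building blocks can be sketched in Cypher, the query language used later in these notes. The labels, relationship type, and property values below are made up for illustration and do not come from any real dataset.

```cypher
// Two nodes, each with a label and properties (all names here are hypothetical).
CREATE (alice:Person {name: 'Alice', city: 'Kochi'})
CREATE (acme:Company {name: 'Acme', founded: 2001})
// An edge (relationship) from one node to the other; it can carry properties too.
CREATE (alice)-[:WORKS_AT {since: 2015}]->(acme)
RETURN alice, acme
```

Reading the connection back is then a pattern match over the stored relationship rather than a join computed from key columns.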
Comparison
Neo4j
Sponsored by Neo Technology, Neo4j is an open-source NoSQL graph database
implemented in Java and Scala.
Neo4j provides full database characteristics including ACID transaction compliance, cluster
support, and runtime failover, making it suitable to use graph data in production scenarios.
Use cases include matchmaking, network management, software analytics, scientific
research, routing, organizational and project management, recommendations, social
networks, and more.
Neo4j Customers:
Neo4j - Environment Setup
1. Make sure that Java 8 (or above) is installed. Otherwise install Java.
10. Connect using the username 'neo4j' with the default password 'neo4j'. You will then be
prompted to change the password. [Please set ‘bdbl’ as the new password.]
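Once logged in, a quick sanity check is to ask the server for its version from the query prompt. The sketch below uses the built-in `dbms.components` procedure; treat the exact output columns as an assumption if your Neo4j version differs.

```cypher
// Report the product name, version(s), and edition of the running server.
CALL dbms.components() YIELD name, versions, edition
RETURN name, versions, edition
```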
Neo4j - Building Blocks
Node: A node is the fundamental unit of a graph. It holds properties as key-value pairs.
Create Node
CREATE (node:label)
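In practice a node is usually created with a label and some properties, and returned so the result can be inspected. A minimal sketch, where the label `person` and the property names are illustrative assumptions:

```cypher
// Create one node with a label and two properties, then return it.
CREATE (n:person {name: 'John', age: 34})
RETURN n
```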
Modify
merge (n:Node {name: 'John'})
set n = {name: 'John', age: 34, coat: 'Yellow', hair: 'Brown'}
return n
(or)
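The alternative that followed "(or)" is not preserved in these notes. One variant that may be useful here, offered as an assumption rather than as the original text, is `SET n += {...}`. Unlike `SET n = {...}`, which replaces every property on the node, `+=` only adds or updates the listed keys and leaves the others untouched.

```cypher
// Find (or create) the node, then update only the listed properties.
MERGE (n:Node {name: 'John'})
SET n += {age: 34, coat: 'Yellow'}
RETURN n
```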
Delete All
MATCH (n)
OPTIONAL MATCH (n)-[r]-()
DELETE n,r
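On Neo4j 2.3 and later, the same cleanup can be written more compactly with `DETACH DELETE`, which removes each matched node together with any relationships attached to it. A sketch, to be used with care since it empties the whole database:

```cypher
// Delete every node and, with it, every relationship.
MATCH (n)
DETACH DELETE n
```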
Example
Relational Database
Graph Model:
Create Nodes
create (john:employee {eid:1000,name:'John',place:'Thrissur',yob:1988,salary:25000})
CREATE CONSTRAINT ON (emp:employee) ASSERT emp.eid IS UNIQUE
create (jacob:employee {eid:1001,name:'Jacob',place:'Bangalore',yob:1987,salary:20000})
create (harsh:employee {eid:1003,name:'Harsh',place:'Calicut',yob:1989,salary:20000})
create (dept:department{did:100,dname:'Research',dloc:'Chennai'})
CREATE CONSTRAINT ON (dep:department) ASSERT dep.did IS UNIQUE
create (dept2:department{did:101,dname:'Admin',dloc:'Trivandrum'})
create (pjt:project{pid:'p10',name:'P10'})
create (pjt2:project{pid:'p11',name:'P11'})
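The `CREATE CONSTRAINT ON ... ASSERT` form used above is the older syntax. Neo4j releases from 4.4 onward accept a `FOR ... REQUIRE` form instead, and Neo4j 5 no longer accepts the older one. A sketch of the equivalent constraints, where the constraint names are made-up identifiers:

```cypher
// Uniqueness constraints in the newer syntax; run each statement separately.
CREATE CONSTRAINT employee_eid_unique IF NOT EXISTS
FOR (emp:employee) REQUIRE emp.eid IS UNIQUE;

CREATE CONSTRAINT department_did_unique IF NOT EXISTS
FOR (dep:department) REQUIRE dep.did IS UNIQUE;
```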
Add Relations
match (a:employee),(b:department) where a.eid=1000 and b.did=100 create (a)-[r:belongsTo]->(b) return a,b
match (a:employee),(b:department) where a.eid=1001 and b.did=100 create (a)-[r:belongsTo]->(b) return a,b
match (a:employee),(b:department) where a.eid=1003 and b.did=101 create (a)-[r:belongsTo]->(b) return a,b
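// Note: no employee with eid 1004 is created above, so this statement matches nothing
// and creates no relationship until such a node exists.
match (a:employee),(b:department) where a.eid=1004 and b.did=101 create (a)-[r:belongsTo]->(b) return a,b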
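Running one of the `create` statements above twice produces two parallel `belongsTo` relationships between the same pair of nodes. A possible variation uses `MERGE` for the relationship, so that re-running it has no further effect. The property-map match is just a shorter way to write the same `where` conditions.

```cypher
// Link employee 1000 to department 100 only if the link does not exist yet.
MATCH (a:employee {eid: 1000}), (b:department {did: 100})
MERGE (a)-[r:belongsTo]->(b)
RETURN a, b
```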
Query
1. List all employee names
match (e:employee) return e.name
2. List employees who work in project P10
match (e:employee)-[:worksIn]->(p:project{pid:'p10'}) return e.name
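Query 2 relies on `worksIn` relationships, which none of the statements in these notes create, so it returns no rows on the data as given. The sketch below first adds one such relationship (the choice of employee 1000 and project p10 is an arbitrary assumption for illustration) and then shows two further queries over the existing `belongsTo` links.

```cypher
// Hypothetical assignment: the notes do not say who works on which project.
MATCH (e:employee {eid: 1000}), (p:project {pid: 'p10'})
MERGE (e)-[:worksIn]->(p);

// Employees who belong to the Research department.
MATCH (e:employee)-[:belongsTo]->(d:department {dname: 'Research'})
RETURN e.name;

// Number of employees per department.
MATCH (e:employee)-[:belongsTo]->(d:department)
RETURN d.dname AS department, count(e) AS employees;
```

Each statement is meant to be run on its own; the semicolons only separate them here.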
Explore More....
1. Developers’ Manual, https://neo4j.com/developer/
2. Neo4j Tutorial: https://www.tutorialspoint.com/neo4j/
3. Ian Robinson, Graph Databases, O’Reilly Media, Inc., 2013
4. Ryan Boyd, Intro to Graph Databases Series [Video Lecture]