Social Network Security Record
TECHNOLOGY
PALKULAM, KANYAKUMARI DISTRICT- 629 401
PRACTICAL RECORD
BONAFIDE CERTIFICATE
Certified that this is the Bonafide Record of work done by Ms. Mercy Fragrance J.G of
the III year, IV semester in the Computer Science and Engineering department of this college,
in the Social Network Security (CCS363) laboratory, in partial fulfilment of the requirements
of the B.E. Degree of Anna University.
EXP NO.1: DESIGN A SOCIAL MEDIA APPLICATION
AIM:
To design a social media application.
PROCEDURE/ OUTPUT:
In this exercise, we will build a simple social network with real-time features and a list
of all members who are online.
We will be using Node.js as the application server, Vanilla JavaScript on the front end, and
Pusher for real-time communication between our server and front end.
We will build an app which is like a friends list or a common chat room, where you can see
who is online and their latest status update in real time. Along the way, we will learn about
Pusher's presence channels and how to find out which members are online on such a channel.
We will be building the following components during this exercise:
Node.js server using ExpressJS framework:
/register API - In order to register/login a new user to our channel and server
by creating their session and saving their info
/isLoggedIn API - To check if a user is already logged in or not in case of
refreshing the browser
/usersystem/auth API - Auth validation done by Pusher after registering it with
our app and on subscribing to a presence or private channel
/logout API - To logout the user and remove the session
Frontend app using Vanilla JavaScript:
Register/Login Form - To register/login a new user by filling in their username
and initial status
Members List - To see everyone who is online and their updated status
Update Status - To click on the existing status and update it on blur of the
status text edit control
Signing up with Pusher
You can create a free account on Pusher. After you sign up and log in for the first time,
you will be asked to create a new app. You will have to fill in some information about your
project, and also select the front-end library and back-end language you will be building
your app with.
For this exercise, we select Vanilla JavaScript for the front end and Node.js for the
backend. This just shows you a set of starter sample code for these selections; you can use
any integration kit later on with this app.
Node.js Server
Node.js should be installed on the system as a prerequisite. Now let us begin building
the Node.js server and all the required APIs using Express. Initialise a new Node project
with the following command:
npm init
Installing dependencies
We will install the required dependencies express, express-session, body-parser and
cookie-parser with the following command (the pusher module is installed later):
npm install express express-session body-parser cookie-parser --save
Foundation server
We will now create the basic foundation for Node Server and also enable sessions in that
using express-session module.
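The foundation server code itself is not reproduced in this record; a minimal sketch, assuming only the middleware described below (the session secret is a placeholder), could look like this:

var express = require('express');
var session = require('express-session');
var bodyParser = require('body-parser');
var cookieParser = require('cookie-parser');

var app = express();

// Parse cookies and request bodies, and serve static files from the public folder
app.use(cookieParser());
app.use(bodyParser.json());
app.use(bodyParser.urlencoded({ extended: false }));
app.use(express.static('public'));

// Enable sessions so that user info can be saved in the request session
app.use(session({
  secret: 'replace-with-a-strong-secret', // placeholder value
  resave: false,
  saveUninitialized: true
}));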
// Error Handler for 404 Pages
app.use(function(req, res, next) {
var error404 = new Error('Route Not Found');
error404.status = 404;
next(error404);
});
module.exports = app;
app.listen(9000, function(){
console.log('Example app listening on port 9000!')
});
In the above code, we have created a basic Express server and, using the .use method,
enabled cookie-parser, body-parser and static file serving from the public folder. We have
also enabled sessions using the express-session module. This enables us to save user
information in the appropriate request session for the user.
Adding Pusher
Pusher has an open source NPM module for Node.js integrations which we will be using. It
provides a set of utility methods to integrate with Pusher APIs using a unique appId, key and
a secret. We will first install the Pusher npm module using the following command:
npm install pusher --save
Now, we can require the Pusher module and create a new instance, passing an options object
with the keys needed to initialise our integration. For this exercise, random keys are shown;
you will have to obtain the real values for your app from the Pusher dashboard.
var Pusher = require('pusher');
var pusher = new Pusher({
appId: '30XXX64',
key: '82XXXXXXXXXXXXXXXXXb5',
secret: '7bXXXXXXXXXXXXXXXX9e',
encrypted: true
});
var app = express();
...
You will have to replace the appId, key and secret with values specific to your own app.
After this, we will write the code for the APIs that register and authenticate users.
Register/Login API
Now, we will develop the first API route of our application through which a new user can
register/login itself and make itself available on our app.
app.post('/register', function(req, res){
console.log(req.body);
if(req.body.username && req.body.status){
var newMember = {
username: req.body.username,
status: req.body.status
}
req.session.user = newMember;
res.json({
success: true,
error: false
});
}else{
res.json({
success: false,
error: true,
message: 'Incomplete information: username and status are required'
});
}
});
In the above code, we have exposed a POST API call on the route /register which would
expect username and status parameters to be passed in the request body. We will be saving
this user info in the request session.
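For example, once the server is running, the route can be exercised from the browser console with a request like the following (the username and status values are placeholders):

// Hypothetical test call to the /register route
fetch('/register', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ username: 'alice', status: 'Hello world!' })
})
  .then(function(res){ return res.json(); })
  .then(function(data){ console.log(data); }); // { success: true, error: false }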
User system auth API
In order to enable any client to subscribe to Pusher private and presence channels, we need to
implement an auth API which authenticates the user request by calling the Pusher.authenticate
method on the server side. Add the following code to the server to fulfil this condition:
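The body of this route is not reproduced in this record. A minimal sketch, assuming the pusher instance created earlier and the user object stored in the session by /register, might look like the following; the presence-channel payload simply reuses the username and status fields used elsewhere in this exercise:

app.post('/usersystem/auth', function(req, res) {
  var socketId = req.body.socket_id;
  var channel = req.body.channel_name;
  var currentMember = req.session.user;
  if (currentMember) {
    // presenceData identifies this member to the other subscribers of the presence channel
    var presenceData = {
      user_id: currentMember.username,
      user_info: {
        status: currentMember.status,
        username: currentMember.username
      }
    };
    var auth = pusher.authenticate(socketId, channel, presenceData);
    res.send(auth);
  } else {
    res.status(403).send({ authenticated: false });
  }
});

The /isLoggedIn route described earlier and the logout route follow: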
app.get('/isLoggedIn', function(req, res){
  if(req.session.user){
    res.send({ authenticated: true });
  }else{
    res.send({ authenticated: false });
  }
});
app.get('/logout', function(req,res){
if(req.session.user){
req.session.user = null;
}
res.redirect('/');
});
Step 2: index.html
Create a new file called index.html in the public folder and add the following boilerplate markup:
<!DOCTYPE html>
<html>
<head>
<title>Whats Up ! Know what others are up to in Realtime !</title>
<link rel="stylesheet" href="https://unpkg.com/[email protected]/build/pure-min.css"
integrity="sha384-
UQiGfs9ICog+LwheBSRCt1o5cbyKIHbwjWscjemyBMT9YCUMZffs6UqUTd0hObXD"
crossorigin="anonymous">
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Raleway:200">
<link rel="stylesheet" href="./style.css">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
<body>
<header>
<div class="logo">
<img src="./assets/pusher-logo.png" />
</div>
<div id="logout" class="logout">
<a href="/logout">Logout</a>
</div>
</header>
<section class="subheader">
<img class="whatsup-logo" src="./assets/whatsup.png" />
<h2>Whats Up ! Know what others are up to in Realtime !</h2>
</section>
<section>
<div id="loader" class="loader">
</div>
<script id="member-template" type="text/x-template">
</script>
<div id="me" class="me">
</div>
<div id="membersList" class="members-list">
</div>
<div id="signup-form" class="tab-content">
<div class="header">
<div><img src="./assets/comments.png"></div>
<div class="text">First Time Sign Up !</div>
</div>
<form class="pure-form" id="user-form">
<div class="signup-form">
<div class="left-side">
<div class="row">
<input type="text" required placeholder="enter a username or
displayname" id="display_name">
</div>
<div class="row">
<textarea placeholder="enter initial status text" required
id="initial_status" rows="3"></textarea>
</div>
</div>
<div class="right-side">
<button
type="submit"
class="button-secondary pure-button">Signup/Login</button>
</div>
</div>
</form>
</div>
</section>
<script src="https://js.pusher.com/4.0/pusher.min.js"></script>
<script type="text/javascript" src="./app.js"></script>
</body>
</html>
In the above boilerplate code, we have referenced our main JavaScript file app.js and the
Pusher client-side JS library. We also have a script tag where we will place the template for a
member row in the members list. Finally, we have two empty div tags with ids me and
membersList to contain the logged-in member's name and info, as well as the list of all other
members with their statuses.
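The member-template script tag and the me div are left empty in the markup above. A hypothetical minimal template, consistent with the {{username}}, {{status}} and {{time}} placeholders used later in app.js and with the myusername and mystatus ids read by renderMe, might look like this (the user icon path is an assumption):

<script id="member-template" type="text/x-template">
  <div class="user-icon"><img src="./assets/user.png"></div>
  <div class="user-info">
    <div class="username">{{username}}</div>
    <div class="status">{{status}}</div>
    <div class="time">{{time}}</div>
  </div>
</script>
<div id="me" class="me">
  <div class="status">
    <div id="myusername" class="text"></div>
    <div id="mystatus" class="text" contenteditable="true"></div>
  </div>
</div>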
Step 3: Style.css
It is important to note that the signup form is shown the first time, while the members list
and the Logout button are hidden by default. Please create a new file called style.css and
add the following CSS to it:
body{
margin:0;
padding:0;
overflow: hidden;
font-family: Raleway;
}
header{
background: #2b303b;
height: 50px;
width:100%;
display: flex;
color:#fff;
}
.loader,
.loader:after {
border-radius: 50%;
width: 10em;
height: 10em;
}
.loader {
margin: 60px auto;
font-size: 10px;
position: relative;
text-indent: -9999em;
border-top: 1.1em solid rgba(82,0,115, 0.2);
border-right: 1.1em solid rgba(82,0,115, 0.2);
border-bottom: 1.1em solid rgba(82,0,115, 0.2);
border-left: 1.1em solid #520073;
-webkit-transform: translateZ(0);
-ms-transform: translateZ(0);
transform: translateZ(0);
-webkit-animation: load8 1.1s infinite linear;
animation: load8 1.1s infinite linear;
}
@-webkit-keyframes load8 {
0% {
-webkit-transform: rotate(0deg);
transform: rotate(0deg);
}
100% {
-webkit-transform: rotate(360deg);
transform: rotate(360deg);
}
}
@keyframes load8 {
0% {
-webkit-transform: rotate(0deg);
transform: rotate(0deg);
}
100% {
-webkit-transform: rotate(360deg);
transform: rotate(360deg);
}
}
.subheader{
display: flex;
align-items: center;
margin: 0px;
}
.whatsup-logo{
height:60px;
border-radius: 8px;
flex:0 60px;
margin-right: 15px;
}
.logout{
flex:1;
justify-content: flex-end;
padding:15px;
display: none;
}
.logout a{
color:#fff;
text-decoration: none;
}
#signup-form{
display: none;
}
input, textarea{
width:100%;
}
section{
padding: 0px 15px;
}
.logo img{
height: 35px;
padding: 6px;
margin-left: 20px;
}
#updateStatus{
display: none;
}
.members-list{
display: none;
flex-direction: column;
}
.me {
display: none;
}
Step 4: Add app.js basic code
Now we will add our JavaScript code, keeping the basic utility elements inside a self-invoking
function (IIFE) to create a private scope for our app variables. We do not want to pollute the
JS global scope.
// Using IIFE for Implementing Module Pattern to keep the Local Space for the JS Variables
(function() {
// Enable pusher logging - don't include this in production
Pusher.logToConsole = true;
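// NOTE: the variable declarations that open this IIFE are not reproduced in this record.
// A sketch, assuming the app key from the Pusher dashboard (placeholder below) and the
// /usersystem/auth route built earlier; it continues into the `encrypted: true` line below:
var serverUrl = "/",
    pusher = new Pusher('82XXXXXXXXXXXXXXXXXb5', { // placeholder key
      authEndpoint: serverUrl + 'usersystem/auth',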
encrypted: true
}),
channel,
userForm = document.getElementById("user-form"),
memberTemplateStr = document.getElementById('member-template').innerHTML;
function showEle(elementId){
document.getElementById(elementId).style.display = 'flex';
}
function hideEle(elementId){
document.getElementById(elementId).style.display = 'none';
}
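// The ajax helper and the form submit binding are not reproduced in this record.
// A minimal sketch using XMLHttpRequest, consistent with how ajax(...) is called below:
function ajax(url, method, payload, successCallback){
  var xhr = new XMLHttpRequest();
  xhr.open(method, url, true);
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.onreadystatechange = function(){
    if(xhr.readyState === 4 && xhr.status === 200){
      successCallback(xhr.responseText);
    }
  };
  xhr.send(JSON.stringify(payload));
}
// Hook the signup form to the addNewMember handler defined further below
userForm.addEventListener('submit', addNewMember);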
ajax(serverUrl+"isLoggedIn","GET",{},isLoginChecked);
function isLoginChecked(response){
var responseObj = JSON.parse(response);
if(responseObj.authenticated){
channel = pusher.subscribe('presence-whatsup-members');
bindChannelEvents(channel);
}
updateUserViewState(responseObj.authenticated);
}
function updateUserViewState(isLoggedIn){
document.getElementById("loader").style.display = "none";
if(isLoggedIn){
document.getElementById("logout").style.display = "flex";
document.getElementById("signup-form").style.display = "none";
}else{
document.getElementById("logout").style.display = "none";
document.getElementById("signup-form").style.display = "block";
}
}
function showLoader(){
document.getElementById("loader").style.display = "block";
document.getElementById("logout").style.display = "none";
document.getElementById("signup-form").style.display = "none";
}
function addNewMember(event){
event.preventDefault();
var newMember = {
"username": document.getElementById('display_name').value,
"status": document.getElementById('initial_status').value
}
showLoader();
ajax(serverUrl+"register","POST",newMember, onMemberAddSuccess);
}
function onMemberAddSuccess(response){
// On Success of registering a new member
console.log("Success: " + response);
userForm.reset();
updateUserViewState(true);
// Subscribing to the 'presence-members' Channel
channel = pusher.subscribe('presence-whatsup-members');
bindChannelEvents(channel);
}
})();
In the above code, we have referenced all the important variables we will be requiring. We
also initialise the Pusher library using new Pusher, passing the API key as the first argument.
The second argument is an optional config object in which we add the key authEndpoint with
the custom Node API route /usersystem/auth, and also the key encrypted set to true.
We will create a couple of generic functions to show or hide an element passing its unique id.
We have also added a common method named ajax to make ajax requests using XMLHttp
object in Vanilla JavaScript.
At page load, we make an ajax request to check whether the user is logged in. If the user is
logged in, we directly use the Pusher instance to subscribe the user to a presence channel
named presence-whatsup-members; you can treat this as the unique chat room or app location
where you want to report/track the online members.
We have also written a method above, addNewMember, which uses an ajax request to the
/register API route we built in Node.js. We pass the name and the initial status entered into
the form.
We also have a method to update the user view state based on the logged-in status. This
method simply updates the visibility of the members list, the logout button and the signup
form. We have used a bindChannelEvents method when the user is logged in, which is
implemented further below.
Please add the following css in style.css file to display the me element appropriately with the
username and the status of the logged in user.
.me {
border:1px solid #aeaeae;
padding:10px;
margin:10px;
border-radius: 10px;
}
.me img{
height: 40px;
width: 40px;
}
.me .status{
padding:5px;
flex:1;
}
.me .status .text{
font-size: 15px;
width:100%;
-webkit-transition: all 1s ease-in 5ms;
-moz-transition: all 1s ease-in 5ms;
transition: all 1s ease-in 5ms;
}
function bindChannelEvents(channel){
channel.bind('client-status-update',statusUpdated);
var reRenderMembers = function(member){
renderMembers(channel.members);
}
channel.bind('pusher:subscription_succeeded', reRenderMembers);
channel.bind('pusher:member_added', reRenderMembers);
channel.bind('pusher:member_removed', reRenderMembers);
}
In the above bindChannelEvents method, we use the channel.bind method to bind event
handlers for 3 internal events
- pusher:subscription_succeeded, pusher:member_added, pusher:member_removed and 1
custom event - client-status-update.
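The statusUpdated handler bound above is not reproduced in this record; a hypothetical sketch, assuming the custom event carries the username and status fields sent by sendStatusUpdateReq later in this exercise, might be:

// Update the status text shown for the member who triggered the event
function statusUpdated(data){
  var memberNode = document.getElementById('user-' + data.username);
  if(memberNode){
    // assumes the member template renders the status inside an element with class "status"
    var statusNode = memberNode.querySelector('.status');
    if(statusNode){
      statusNode.innerHTML = data.status;
    }
  }
}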
Now we will add the JavaScript code to render the list of members. It is important to know
that the object returned from the .subscribe method has a property called members, which can
be used to get information about the logged-in user (referred to by the key me) and the other
members (under the key members). Add the following code to the app.js file:
// Render the list of members with updated data and also render the logged in user component
function renderMembers(channelMembers){
var members = channelMembers.members;
var membersListNode = document.createElement('div');
showEle('membersList');
Object.keys(members).map(function(currentMember){
if(currentMember !== channelMembers.me.id){
var currentMemberHtml = memberTemplateStr;
currentMemberHtml = currentMemberHtml.replace('{{username}}',currentMember);
currentMemberHtml =
currentMemberHtml.replace('{{status}}',members[currentMember].status);
currentMemberHtml = currentMemberHtml.replace('{{time}}','');
var newMemberNode = document.createElement('div');
newMemberNode.classList.add('member');
newMemberNode.setAttribute("id","user-"+currentMember);
newMemberNode.innerHTML = currentMemberHtml;
membersListNode.appendChild(newMemberNode);
}
});
renderMe(channelMembers.me);
document.getElementById("membersList").innerHTML =
membersListNode.innerHTML;
}
function renderMe(myObj){
document.getElementById('myusername').innerHTML = myObj.id;
document.getElementById('mystatus').innerHTML = myObj.info.status;
}
We have added the event handler for new member add/remove event to re-render the
members list so that it remains updated with the online members only. In order to show the
members list we need to add the following style into our file style.css.
.member{
display: flex;
border-bottom: 1px solid #aeaeae;
margin-bottom: 10px;
padding: 10px;
}
.member .user-icon{
flex:0 40px;
display: flex;
align-items: center;
justify-content: center;
}
.member .user-info{
padding:5px;
margin-left:10px;
}
Next, add the following method to app.js; it broadcasts the updated status to the channel as a client event:
function sendStatusUpdateReq(event){
var newStatus = document.getElementById('mystatus').innerHTML;
var username = document.getElementById('myusername').innerText;
channel.trigger("client-status-update", {
username: username,
status: newStatus
});
}
Note that client events must be enabled for your app in the Settings page of the Pusher dashboard; otherwise, triggering client-status-update fails with an error like the following:
Pusher : Error : {
"type":"WebSocketError",
"error":{
"type":"PusherError",
"data":
{
"code":null,
"message":"To send client events, you must enable this feature in the Settings page of
your dashboard."
}
}
}
We have built an application which displays all the online members for a particular presence
channel and their updates. If any of the online users updates their status, every user is
notified about the updated status.
This component or code can be used for developing a social networking section in many web
apps. It is an important use case where the user needs to know about other available
participants. For example, in an online classroom app, each participant can see the other
participants, and the status can correspond to a question a participant wants to ask the
presenter.
RESULT:
Thus a social media application has been designed and developed successfully.
EXP NO.2: CREATE A NETWORK MODEL USING NEO4J
AIM:
To create a network model using neo4j.
PROCEDURE:
Creating a network model graph in Neo4j typically involves generating a substantial
amount of data. Below is an example of a Cypher script that creates a network graph with
multiple nodes and relationships. This example creates a graph with nodes representing
employees and departments, where employees belong to departments:
// Creating nodes for employees
CREATE (:Employee {id: 1, name: 'Alice', position: 'Manager'})
CREATE (:Employee {id: 2, name: 'Bob', position: 'Engineer'})
CREATE (:Employee {id: 3, name: 'Charlie', position: 'Marketer'})
CREATE (:Employee {id: 4, name: 'David', position: 'Salesperson'})
CREATE (:Employee {id: 5, name: 'Eva', position: 'Analyst'})
CREATE (:Employee {id: 6, name: 'Frank', position: 'Engineer'})
CREATE (:Employee {id: 7, name: 'Grace', position: 'Manager'})
CREATE (:Employee {id: 8, name: 'Henry', position: 'Marketer'})
CREATE (:Employee {id: 9, name: 'Ivy', position: 'Analyst'})
CREATE (:Employee {id: 10, name: 'Jack', position: 'Salesperson'})
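The department nodes and the remaining WORKS_IN relationships are not reproduced in this record; they would be created in the same way, for example (the department names here are illustrative):

// Creating nodes for departments
CREATE (:Department {name: 'Engineering'})
CREATE (:Department {name: 'Marketing'})
CREATE (:Department {name: 'Sales'})
CREATE (:Department {name: 'Analytics'})

// Connecting an employee to a department
MATCH (e:Employee {id: 1}), (d:Department {name: 'Engineering'})
CREATE (e)-[:WORKS_IN]->(d)

After creating the data, the whole graph can be viewed with: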
MATCH (n)
RETURN n
MATCH (e:Employee {id: 6}), (d:Department {name: 'Engineering'})
CREATE (e)-[:WORKS_IN]->(d)
RESULT:
Thus a network model using neo4j has been created successfully.
EXP NO.3: READ AND WRITE DATA FROM A GRAPH DATABASE USING NEO4J
AIM:
To read and write data from a graph database in neo4j.
PROCEDURE:
Reading from Neo4j
Neo4j Connector for Apache Spark allows you to read data from a Neo4j instance in three
different ways:
By node labels
By relationship type
By Cypher® query
Reading all the nodes of label Person from your local Neo4j instance is as simple as this:
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("labels", "Person")
.load()
.show()
Example output (node read):
<id>  <labels>  name  age
0     [Person]  John  32

From the table of available read options:
labels — List of node labels separated by colon; the first label is to be the primary label. Default: (none). Required: Yes*
Table 2. List of available read options
relationship.nodes.map — If it is set to true, source and target nodes are returned as
Map<String, String>; otherwise the properties are flattened, returning every single node
property as a column prefixed by source or target. Default: false. Required: No
* Only one of these options can be specified at a time.
Read data
Reading data from a Neo4j Database can be done in three ways:
Custom Cypher query
Node
Relationship
spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("query", "MATCH (n:Person) WITH n LIMIT 2 RETURN id(n) AS id, n.name AS
name")
.load()
.show()
id name
0 John Doe
1 Jane Doe
We recommend individual property fields to be returned, rather than returning graph entity (node,
relationship, and path) types. This best maps to Spark’s type system and yields the best results. So instead
of writing:
MATCH (p:Person) RETURN p
write the following:
MATCH (p:Person) RETURN id(p) AS id, p.name AS name.
If your query returns a graph entity, use the labels or relationship modes instead.
The structure of the Dataset returned by the query is influenced by the query itself. In this
particular context, it could happen that the connector isn't able to sample the schema from
the query; in these cases, we suggest trying the option schema.strategy set to string, as
described in the connector documentation.
A read query must always return some data (that is, it must always have a RETURN statement).
If you use stored procedures, remember to YIELD and then RETURN data.
Script option
The script option allows you to execute a series of preparation scripts before the Spark job
execution. The result of the last query can be reused in combination with the query read mode
as follows:
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().getOrCreate()
spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("script", "RETURN 'foo' AS val")
.option("query", "UNWIND range(1,2) as id RETURN id AS val, scriptResult[0].val AS
script")
.load()
.show()
Before the extraction from Neo4j starts, the connector runs the content of the script option
and the result of the last query is injected into the query.
val script
1 foo
2 foo
Schema
The first 10 (or any number specified by the schema.flatten.limit option) results are flattened
and the schema is created from those properties.
If the query returns no data, the sampling is not possible. In this case, the connector creates a
schema from the return statement, and every column is going to be of type String. This does
not cause any problems since you have no data in your dataset.
For example, you have this query:
MATCH (n:NON_EXISTENT_LABEL) RETURN id(n) AS id, n.name, n.age
The created schema is the following:
Column Type
id String
n.name String
n.age String
The returned column order is not guaranteed to match the RETURN statement for Neo4j 3.x and Neo4j
4.0.
Starting from Neo4j 4.1 the order is the same.
MATCH (p:Person)
WITH p.name AS name
ORDER BY name
LIMIT 10
RETURN name
This form returns the same data as a simpler query with a plain final RETURN, but only the
WITH-based form shown above is usable with the Spark connector and partition-able, because of
the WITH clause and the simple final RETURN clause. If you choose to reformulate queries to
use "internal SKIP/LIMIT", take careful notice of ordering operations to guarantee the same
result set. You may also use the query.count option rather than reformulating your query.
Node
You can read nodes by specifying a single label or multiple labels, like so:
Single label
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("labels", "Person")
.load()
Multiple labels
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("labels", "Person:Customer:Confirmed")
.load()
The label list can be specified with or without the leading colon:
Person:Customer and :Person:Customer are considered the same.
Columns
When reading data with this method, the DataFrame contains all the fields contained in the
nodes, plus two additional columns.
<id> the internal Neo4j ID
<labels> a list of labels for that node
Schema
If APOC is available, the schema is created with apoc.meta.nodeTypeProperties. Otherwise,
we execute the following Cypher query:
MATCH (n:<labels>)
RETURN n
ORDER BY rand()
LIMIT <limit>
Where <labels> is the list of labels provided by labels option and <limit> is the value
provided by schema.flatten.limit option. The results of such query are flattened, and the
schema is created from those properties.
Example
CREATE (p1:Person {age: 31, name: 'Jane Doe'}),
(p2:Person {name: 'John Doe', age: 33, location: null}),
(p3:Person {age: 25, location: point({latitude: -37.659560, longitude: -68.178060})})
The following schema is created:
Field Type
<id> Int
<labels> String[]
age Int
name String
location Point
Relationship
To read a relationship you must specify the relationship type, the source node labels, and the
target node labels.
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("relationship", "BOUGHT")
.option("relationship.source.labels", "Person")
.option("relationship.target.labels", "Product")
.load()
This creates the following Cypher query:
MATCH (source:Person)-[rel:BOUGHT]->(target:Product)
RETURN source, rel, target
Node mapping
The result format can be controlled by the relationship.nodes.map option (default is false).
When it is set to false, source and target nodes properties are returned in separate columns
prefixed with source. or target. (i.e., source.name, target.price).
When it is set to true, the source and target nodes properties are returned as Map[String,
String] in two columns named source and target.
Nodes map set to false
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("relationship", "BOUGHT")
.option("relationship.nodes.map", "false")
.option("relationship.source.labels", "Person")
.option("relationship.target.labels", "Product")
.load()
.show()
Nodes map set to true
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("relationship", "BOUGHT")
.option("relationship.nodes.map", "true")
.option("relationship.source.labels", "Person")
.option("relationship.target.labels", "Product")
.load()
.show()
Example output with relationship.nodes.map set to true (each row shows the relationship data followed by the source and target node maps):

4  BOUGHT  240  source: {fullName: "John Doe", id: 1, <labels>: [Person], <id>: 1}  target: {name: "Product 1", id: 52, <labels>: [Product], <id>: 0}
4  BOUGHT  145  source: {fullName: "Jane Doe", id: 1, <labels>: [Person], <id>: 3}  target: {name: "Product 2", id: 53, <labels>: [Product], <id>: 2}
Columns
When reading data with this method, the DataFrame contains the following columns:
<id> the internal Neo4j ID.
<relationshipType> the relationship type.
rel.[property name] relationship properties.
Depending on the value of relationship.nodes.map option.
If true:
source the Map<String, String> of source node
target the Map<String, String> of target node
If false:
<sourceId> the internal Neo4j ID of source node
<sourceLabels> a list of labels for source node
<targetId> the internal Neo4j ID of target node
<targetLabels> a list of labels for target node
source.[property name] source node properties
target.[property name] target node properties
Filtering
You can use Spark to filter properties of the relationship, the source node, or the target node.
Use the correct prefix:
If relationship.nodes.map is set to false:
`source.[property]` for the source node properties.
`rel.[property]` for the relationship property.
`target.[property]` for the target node property.
import org.apache.spark.sql.{SaveMode, SparkSession}
val df = spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("relationship", "BOUGHT")
.option("relationship.nodes.map", "false")
.option("relationship.source.labels", "Person")
.option("relationship.target.labels", "Product")
.load()
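For example, a filter on a source-node property can then be applied with Spark's where (the property name is illustrative):

df.where("`source.fullName` = 'John Doe'").show()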
If relationship.nodes.map is set to true:
`<source>`.`[property]` for the source node map properties.
`<rel>`.`[property]` for the relationship map property.
`<target>`.`[property]` for the target node map property.
In this case, all the map values are to be strings, so the filter value must be a string too.
import org.apache.spark.sql.{SaveMode, SparkSession}
val df = spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("relationship", "BOUGHT")
.option("relationship.nodes.map", "true")
.option("relationship.source.labels", "Person")
.option("relationship.target.labels", "Product")
.load()
Writing to Neo4j
Let’s look at the following code sample:
import org.apache.spark.sql.{SaveMode, SparkSession}
import scala.util.Random
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._
case class Point3d(`type`: String = "point-3d",
srid: Int,
x: Double,
y: Double,
z: Double)
case class Person(name: String, surname: String, age: Int, livesIn: Point3d)
val total = 10
val rand = Random
val ds = (1 to total)
.map(i => {
Person(name = "Andrea " + i, "Santurbano " + i, rand.nextInt(100),
Point3d(srid = 4979, x = 12.5811776, y = 41.9579492, z = 1.3))
}).toDS()
ds.write
.format("org.neo4j.spark.DataSource")
.mode(SaveMode.ErrorIfExists)
.option("url", "bolt://localhost:7687")
.option("labels", ":Person:Customer")
.save()
The above code inserts 10 nodes into Neo4j via Spark, and each of them has:
Two labels: Person and Customer.
Four properties: name, surname, age, and livesIn.
Save mode
To persist data into Neo4j, the Spark Connector supports two save modes that work only
if UNIQUE or NODE KEY constraints are defined in Neo4j for the given properties.
The SaveMode examples apply to the Scala class org.apache.spark.sql.SaveMode. For PySpark, use a
static string with the name of the SaveMode. So instead of SaveMode.Overwrite, use Overwrite for
PySpark.
For SaveMode.Overwrite mode, you need to have unique constraints on the keys.
If you are using Spark 3, the default save mode ErrorIfExists does not work, use Append instead.
Table 1. List of available write options
Neo4j Connector for Apache Spark provides batch writes to speed up the ingestion process, so if the
process at some point fails, all the previous data are already persisted.
Write data
Writing data to a Neo4j database can be done in three ways:
Custom Cypher® query
Node
Relationship
Custom Cypher query
In case you use the option query, the Spark Connector persists the entire Dataset by using the
provided query. The nodes are sent to Neo4j in a batch of rows defined in
the batch.size property, and your query is wrapped up in an UNWIND $events AS
event statement.
Let’s look at the following simple Spark program:
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._
val df = (1 to 10)/*...*/.toDF()
df.write
.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("query", "CREATE (n:Person {fullName: event.name + event.surname})")
.save()
This generates the following query:
UNWIND $events AS event
CREATE (n:Person {fullName: event.name + event.surname})
Here events is the batch created from your dataset.
Node
In case you use the option labels, the Spark Connector persists the entire dataset as nodes.
Depending on the SaveMode, it is going to CREATE or MERGE nodes (in the latter case
the node.keys properties are used).
The nodes are sent to Neo4j in a batch of rows defined in the batch.size property, and
an UNWIND operation is performed under the hood.
Let’s remember the first example in this chapter:
ErrorIfExists mode
import org.apache.spark.sql.{SaveMode, SparkSession}
import scala.util.Random
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._
case class Point3d(`type`: String = "point-3d",
srid: Int,
x: Double,
y: Double,
z: Double)
case class Person(name: String, surname: String, age: Int, livesIn: Point3d)
val total = 10
val rand = Random
val df = (1 to total)
.map(i => {
Person(name = "Andrea " + i, "Santurbano " + i, rand.nextInt(100),
Point3d(srid = 4979, x = 12.5811776, y = 41.9579492, z = 1.3))
}).toDF()
df.write
.format("org.neo4j.spark.DataSource")
.mode(SaveMode.ErrorIfExists)
.option("url", "bolt://localhost:7687")
.option("labels", ":Person:Customer")
.save()
The above code is converted into a Cypher query similar to the following:
UNWIND $events AS event
CREATE (n:`Person`:`Customer`) SET n += event.properties
The following example shows how to use the same DataFrame and save it in Overwrite mode:
Overwrite mode
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._
val df = (1 to 10)/*...*/.toDF()
df.write
.format("org.neo4j.spark.DataSource")
.mode(SaveMode.Overwrite)
.option("url", "bolt://localhost:7687")
.option("labels", ":Person:Customer")
.option("node.keys", "name,surname")
.save()
The code above generates the following Cypher query:
UNWIND $events AS event
MERGE (n:`Person`:`Customer` {name: event.keys.name, surname: event.keys.surname})
SET n += event.properties
You must specify which columns of your DataFrame are used as keys to match the nodes.
You control this with the option node.keys, specifying a comma-separated list
of key:value pairs, where the key is the DataFrame column name and the value is the node
property name.
If key and value are the same field, you can specify one without the colon. For example, if you
have .option("node.keys", "name:name,email:email"), you can also write .option("node.keys",
"name,email").
In case the column value is a Map<String, Value> (where Value can be any supported Neo4j
Type), the connector automatically tries to flatten it.
Let’s consider the following dataset:
id name lives_in
1 Andrea Santurbano {address: 'Times Square, 1', city: 'NY', state: 'NY'}
2 Davide Fantuzzi {address: 'Statue of Liberty, 10', city: 'NY', state: 'NY'}
Neo4j Connector for Apache Spark flattens the maps, and each map value gets its own property:

id  name               lives_in.address        lives_in.city  lives_in.state
1   Andrea Santurbano  Times Square, 1         NY             NY
2   Davide Fantuzzi    Statue of Liberty, 10   NY             NY
Relationship
You can write a DataFrame to Neo4j by specifying source, target nodes, and relationships.
Overview
Before diving into the actual process, let's clarify the vocabulary first. Since this method of
writing data to Neo4j is more complex and a few combinations of options can be used, let's
spend more time explaining it.
In theory you should take your dataset and move the columns around to create source and
target nodes, eventually creating the specified relationships between them.
This is a basic example of what would happen:
UNWIND $events AS event
CREATE (source:Person)
SET source = event.source
CREATE (target:Product)
SET target = event.target
CREATE (source)-[rel:BOUGHT]->(target)
SET rel += event.rel
The CREATE keyword for the source and target nodes can be replaced
by MERGE or MATCH. To control this you can use the Node save modes.
You can set source and target nodes independently by
using relationship.source.save.mode or relationship.target.save.mode.
These options accept a case-insensitive string as a value, which can be one of ErrorIfExists,
Overwrite, or Append; they work in the same way as the Node save modes.
When using MATCH or MERGE, you need to specify keys that identify the nodes. This is what the
options relationship.source.node.keys and relationship.target.node.keys are for. More on this
below.
The CREATE keyword for the relationship can be replaced by a MERGE. You can control
this with Save mode.
You are also required to specify one of the two Save Strategies. This identifies which method
is to be used to create the Cypher query and can have additional options available.
Save strategies
There are two strategies you can use to write relationships: Native (default strategy)
and Keys.
Native strategy
The Native strategy is useful when you have a schema that conforms with the Relationship
read schema, and the relationship.nodes.map set to false.
If you want to read relationship from a database, filter data, and write the result to another
database, you can refer to the following example:
import org.apache.spark.sql.{SaveMode, SparkSession}
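// The remainder of this example is not reproduced in the record. A sketch consistent
// with the generated query shown below; df is assumed to be a DataFrame obtained from a
// relationship read with relationship.nodes.map set to false:
val spark = SparkSession.builder().getOrCreate()
df.write
  .format("org.neo4j.spark.DataSource")
  .option("url", "bolt://localhost:7687")
  .option("relationship", "BOUGHT")
  .option("relationship.source.labels", ":Person:Rich")
  .option("relationship.target.labels", ":Product:Expensive")
  .save()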
You just need to specify the source node labels, the target node labels, and the relationship
you want between them.
The generated query is the following:
UNWIND $events AS event
CREATE (source:Person:Rich)
SET source = event.source
CREATE (target:Product:Expensive)
SET target = event.target
CREATE (source)-[rel:BOUGHT]->(target)
SET rel += event.rel
Here event.source, event.target, and event.rel carry, respectively, the source node, target
node, and relationship properties taken from the corresponding DataFrame columns.
The default save mode for source and target nodes is Match. That means that the relationship can be
created only if the nodes are already in your database.
When using Overwrite or Match node save mode, you should specify which keys should be
used to identify the nodes.
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
// we read our DF from Neo4j using the relationship method
val df = spark.read.format("org.neo4j.spark.DataSource")
.option("url", "bolt://first.host.com:7687")
.option("relationship", "BOUGHT")
.option("relationship.nodes.map", "false")
.option("relationship.source.labels", "Person")
.option("relationship.target.labels", "Product")
.load()
df.write
.format("org.neo4j.spark.DataSource")
.option("url", "bolt://second.host.com:7687")
.option("relationship", "SOLD")
.option("relationship.source.labels", ":Person:Rich")
.option("relationship.source.save.mode", "Overwrite")
.option("relationship.source.node.keys", "source.fullName:fullName")
.option("relationship.target.labels", ":Product:Expensive")
.option("relationship.target.save.mode", "Overwrite")
.option("relationship.target.node.keys", "target.id:id")
.save()
You must specify which columns of your DataFrame are being used as keys to match the
nodes. You control this with the
options relationship.source.node.keys and relationship.target.node.keys, specifying a comma-
separated list of key:value pairs, where the key is the DataFrame column name, and the value
is the node property name.
The generated query is the following:
UNWIND $events AS event
MERGE (source:Person:Rich {fullName: event.source.fullName})
SET source = event.source
MERGE (target:Product:Expensive {id: event.target.id})
SET target = event.target
CREATE (source)-[rel:BOUGHT]->(target)
SET rel += event.rel
Remember that you can choose to CREATE or MERGE the relationship with the Save mode.
If the provided DataFrame schema doesn’t conform to the required schema, meaning that none of the
required columns is present, the write fails.
Keys strategy
When you want more control over the relationship writing, you can use the Keys strategy.
As in the case of using the Native strategy, you can specify node keys to identify nodes. In
addition, you can also specify which columns should be written as nodes properties.
Specify keys
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._
val musicDf = Seq(
(12, "John Bonham", "Drums"),
(19, "John Mayer", "Guitar"),
(32, "John Scofield", "Guitar"),
(15, "John Butler", "Guitar")
).toDF("experience", "name", "instrument")
musicDf.write
.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("relationship", "PLAYS")
.option("relationship.save.strategy", "keys")
.option("relationship.source.labels", ":Musician")
.option("relationship.source.save.mode", "overwrite")
.option("relationship.source.node.keys", "name:name")
.option("relationship.target.labels", ":Instrument")
.option("relationship.target.node.keys", "instrument:name")
.option("relationship.target.save.mode", "overwrite")
.save()
This creates a MERGE query using name property as key for Musician nodes. The value
of instrument column is used as a value for Instrument property name, generating a statement
like:
MERGE (target:Instrument {name: event.target.instrument}).
Here you must specify which columns of your DataFrame will be written in the source node
and in the target node properties. You can do this with the
options relationship.source.node.properties and relationship.target.node.properties, specifying
a comma-separated list of key:value pairs, where the key is the DataFrame column name, and
the value is the node property name.
Same applies to relationship.properties option, used to specify which DataFrame columns are
written as relationship properties.
If key and value are the same field you can specify one without the colon. For example, if you
have .option("relationship.source.node.properties", "name:name,email:email"), you can also
write .option("relationship.source.node.properties", "name,email"). Same applies
for relationship.source.node.keys and relationship.target.node.keys.
Specify properties and keys
import org.apache.spark.sql.{SaveMode, SparkSession}
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._
val musicDf = Seq(
(12, "John Bonham", "Orange", "Drums"),
(19, "John Mayer", "White", "Guitar"),
(32, "John Scofield", "Black", "Guitar"),
(15, "John Butler", "Wooden", "Guitar")
).toDF("experience", "name", "instrument_color", "instrument")
musicDf.write
.format("org.neo4j.spark.DataSource")
.option("url", "bolt://localhost:7687")
.option("relationship", "PLAYS")
.option("relationship.save.strategy", "keys")
.option("relationship.source.labels", ":Musician")
.option("relationship.source.save.mode", "overwrite")
.option("relationship.source.node.keys", "name:name")
.option("relationship.target.labels", ":Instrument")
.option("relationship.target.node.keys", "instrument:name")
.option("relationship.target.node.properties", "instrument_color:color")
.option("relationship.target.save.mode", "overwrite")
.save()
Node save modes
You can specify four different modes for saving the nodes:
Overwrite mode performs a MERGE on that node.
ErrorIfExists mode performs a CREATE (not available for Spark 3).
Append mode performs a CREATE (not available for Spark 2.4).
Match mode performs a MATCH.
For Overwrite mode you must have unique constraints on the keys.
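The following example shows how the connector can create constraints before the import starts via the schema.optimization.type option; its opening lines are not reproduced in this record. A sketch of those missing lines, with the Dataset name and save mode as assumptions, would be:

ds.write
  .format("org.neo4j.spark.DataSource")
  .mode(SaveMode.Overwrite)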
.option("url", SparkConnectorScalaSuiteIT.server.getBoltUrl)
.option("labels", ":Person:Customer")
.option("node.keys", "surname")
.option("schema.optimization.type", "NODE_CONSTRAINTS")
.save()
Before the import starts, the code above creates the following schema query:
CREATE CONSTRAINT FOR (p:Person) REQUIRE (p.surname) IS UNIQUE
Take into consideration that the first label is used for the index creation.
Script option
The script option allows you to execute a series of preparation scripts before the Spark job
execution. The result of the last query can be reused in combination with the query ingestion
mode as follows:
val ds = Seq(SimplePerson("Andrea", "Santurbano")).toDS()
ds.write
.format(classOf[DataSource].getName)
.mode(SaveMode.ErrorIfExists)
.option("url", SparkConnectorScalaSuiteIT.server.getBoltUrl)
.option("query", "CREATE (n:Person{fullName: event.name + ' ' + event.surname, age:
scriptResult[0].age})")
.option("script",
"""CREATE INDEX person_surname FOR (p:Person) ON (p.surname);
|CREATE CONSTRAINT product_name_sku FOR (p:Product)
| REQUIRE (p.name, p.sku)
| IS NODE KEY;
|RETURN 36 AS age;
|""".stripMargin)
.save()
Before the import starts, the connector runs the content of the script option, and the result of
the last query is injected into the query. At the end the full query executed by the connector
while the data is being ingested is the following:
WITH $scriptResult AS scriptResult
UNWIND $events AS event
CREATE (n:Person{fullName: event.name + ' ' + event.surname, age: scriptResult[0].age})
scriptResult is the result from the last query contained within the script options that
is RETURN 36 AS age;
Note about columns with Map type
When a Dataframe column is a map, what we do internally is to flatten the map as Neo4j
does not support this type for graph entity properties; so for a Spark job like this:
val data = Seq(
("Foo", 1, Map("inner" -> Map("key" -> "innerValue"))),
("Bar", 2, Map("inner" -> Map("key" -> "innerValue1"))),
).toDF("id", "time", "table")
data.write
.mode(SaveMode.Append)
.format(classOf[DataSource].getName)
.option("url", SparkConnectorScalaSuiteIT.server.getBoltUrl)
.option("labels", ":MyNodeWithFlattenedMap")
.save()
In Neo4j for the nodes with label MyNodeWithFlattenedMap you’ll find this information
stored:
MyNodeWithFlattenedMap {
id: 'Foo',
time: 1,
`table.inner.key`: 'innerValue'
}
MyNodeWithFlattenedMap {
id: 'Bar',
time: 2,
`table.inner.key`: 'innerValue1'
}
Now you could fall into problematic situations like the following one:
val data = Seq(
("Foo", 1, Map("key.inner" -> Map("key" -> "innerValue"), "key" -> Map("inner.key" ->
"value"))),
("Bar", 1, Map("key.inner" -> Map("key" -> "innerValue1"), "key" -> Map("inner.key" ->
"value1"))),
).toDF("id", "time", "table")
data.write
.mode(SaveMode.Append)
.format(classOf[DataSource].getName)
.option("url", SparkConnectorScalaSuiteIT.server.getBoltUrl)
.option("labels", ":MyNodeWithFlattenedMap")
.save()
Since the resulting flattened keys are duplicated, the Neo4j Spark connector picks one of the
associated values in a non-deterministic way.
The information stored into Neo4j will be the following (consider that the order is not
guaranteed):
MyNodeWithFlattenedMap {
id: 'Foo',
time: 1,
`table.key.inner.key`: 'innerValue' // but it could be `value` as the order is not guaranteed
}
MyNodeWithFlattenedMap {
id: 'Bar',
time: 1,
`table.key.inner.key`: 'innerValue1' // but it could be `value1` as the order is not guaranteed
}
Group duplicated keys to array of values
You can use the option schema.map.group.duplicate.keys to avoid this problem. The
connector will group all the values with the same keys into an array. The default value for the
option is false. In a scenario like this:
val data = Seq(
("Foo", 1, Map("key.inner" -> Map("key" -> "innerValue"), "key" -> Map("inner.key" ->
"value"))),
("Bar", 1, Map("key.inner" -> Map("key" -> "innerValue1"), "key" -> Map("inner.key" ->
"value1"))),
).toDF("id", "time", "table")
data.write
.mode(SaveMode.Append)
.format(classOf[DataSource].getName)
.option("url", SparkConnectorScalaSuiteIT.server.getBoltUrl)
.option("labels", ":MyNodeWithFlattenedMap")
.option("schema.map.group.duplicate.keys", true)
.save()
the output would be:
MyNodeWithFlattenedMap {
id: 'Foo',
time: 1,
`table.key.inner.key`: ['innerValue', 'value'] // the order is not guaranteed
}
MyNodeWithFlattenedMap {
id: 'Bar',
time: 1,
`table.key.inner.key`: ['innerValue1', 'value1'] // the order is not guaranteed
}
RESULT:
Thus reading and writing data from the graph database have been performed successfully.
EXP NO.4: FIND FRIEND OF FRIENDS USING NEO4J
AIM:
To find friend of friends using neo4j.
PROCEDURE/ OUTPUT:
In a retail scenario (either online or brick-and-mortar), we could store the baskets that
customers have purchased in a graph like the one below.
This graph shows how we use a simple linked list of shopping baskets connected
by NEXT relationships to create a purchase history for the customer.
In the graph above, we see that the customer has visited three times, saved their first purchase
for later (the SAVED relationship between customer and basket nodes).
Ultimately, the customer bought one basket (indicated by the BOUGHT relationship between
customer and basket node) and is currently assembling a basket, shown by
the CURRENT relationship that points to an active basket at the head of the linked list.
It’s important to understand this isn’t a schema or an entity-relationship (ER) diagram but
represents actual data for a single customer. A real graph of many such customers would be
huge (far too big for examples in a blog) but would exhibit the same kind of structure.
In graph form, it’s easy to figure out the customer’s behavior: They became a (potential) new
customer but failed to commit to buying toothpaste and came back one day later and bought
toothpaste, bread and butter. Finally, the customer settled on buying bread and butter in their
next purchase – which is a repeated pattern in their purchase history we could ultimately use
to serve them better.
Now that we have a graph of customers and the past products they've bought, we can think about
recommendations to influence their future buying behavior.
By far, the simplest recommendation is to show popular products across the store. This is
trivial in Cypher, as we see in the following query:
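The query itself is not reproduced in this record; a sketch consistent with the description below (the label and relationship names are illustrative assumptions) would be:

MATCH (customer:Customer)-[:BOUGHT]->()-[:HAS]->(product:Product)
RETURN product, count(product) AS popularity
ORDER BY popularity DESC
LIMIT 5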
First, the MATCH clause shows how ASCII-art is used to declare the graph structure (or
pattern) that we’re looking for. In this case, it can be read as “customers who bought a basket
that had a product in it” except since baskets aren’t particularly important for this query
we’ve elided them using the anonymous node ().
Then we RETURN the data that matched the pattern and operate on it with some (familiar
looking) aggregate functions. That is, we return the node representing the product(s) and the
count of how many product nodes matched, then order by the number of nodes that matched
in a descending fashion. We’ve also limited the returns to the top five, which gives us the
most popular products in the purchasing data.
However, this query isn't really contextualized by the customer but by all customers, and so
isn't optimized for any given individual (though it might be very useful for supply chain
management). We can do better without much additional work by recommending historically
popular purchases that the customer has made themselves, as in the following query:
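Again, the query is not reproduced here; based on the description that follows (the name Alice comes from that description, the other names are illustrative), it would look roughly like:

MATCH (customer:Customer {name: 'Alice'})-[:BOUGHT]->()-[:HAS]->(product:Product)
RETURN product, count(product) AS popularity
ORDER BY popularity DESC
LIMIT 5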
The only change in this query, compared to the previous one, is the inclusion of a constraint
on the customer node that it must contain a key name and a value Alice. This is actually a far
better query from the customer’s point of view since it’s egocentric (as good
recommendations should be!).
Of course, in an age of social selling it’d be even better to show the customer popular
products in their social network rather than just their own purchases since this strongly
influences buying behavior.
As you’d expect, adding a social dimension to a Neo4j graph database is easy, and querying
for friends/friends-of-friends/neighbors/colleagues or other demographics is straightforward
as in this query:
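A sketch of this query, consistent with the explanation that follows (the FRIEND relationship and the variable-length path come from that explanation; the other names are illustrative), is:

MATCH (customer:Customer {name: 'Alice'})-[:FRIEND*1..2]->(friend:Customer)
WHERE customer <> friend
WITH DISTINCT friend
MATCH (friend)-[:BOUGHT]->()-[:HAS]->(product:Product)
RETURN product, count(product) AS popularity
ORDER BY popularity DESC
LIMIT 5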
To retrieve the purchased products of both direct friends and friends-of-friends, we use the
Cypher WITH clause to divide the query into two logical parts, piping results from the first
part into the second. In the first part of this query, we see the familiar syntax where we find
the current customer (Alice) and traverse the graph matching for either Alice's direct friends
or their friends (her friends-of-friends).
This is a straightforward query since Neo4j supports a flexible path-length notation, like so: -
[:FRIEND*1..2]-> which means one or two FRIEND relationships deep. In this case, we get
all friends (depth one) and friend-of-friends (at depth two), but the notation can be
parameterized for any maximum and minimum depth.
In matching, we must take care not to include Alice herself in the results (because your
friend's friend can be you!). It is the WHERE clause that ensures there is only a match when
the customer and the candidate friend are not the same node.
We also don't want duplicate friends-of-friends that are also direct friends (which often
happens in groups of friends). Using the DISTINCT keyword ensures that we don't get duplicate
results from equivalent pattern matches.
Once we have the friends and friends-of-friends of the customer, the WITH clause pipes the
results from the first part of the query into the second. In the second half of the query, we’re
back in familiar territory, matching against customers (the friends and friends-of-friends) who
bought products and ranking them by sales (the number of bought baskets each product
appeared in).
RESULT:
Thus finding friends of friends has been implemented in Neo4j successfully.
EXP NO.5: IMPLEMENT SECURE SEARCH IN SOCIAL MEDIA
AIM:
To implement secure search in social media.
PROCEDURE:
Implementing secure search in social media platforms represents a multifaceted
endeavor, integrating various technical, procedural, and regulatory measures to fortify user
privacy and data security while ensuring seamless search experiences. Here’s an in-depth
exploration on the implementation of secure search:
1. Understanding the Need for Secure Search in Social Media
The prevalence of social media as a communication and information-sharing platform
necessitates robust security measures to protect user data amid evolving threats. Secure
search serves as a critical facet in this landscape, ensuring that user queries, interactions, and
data remain confidential, shielding against unauthorized access and misuse.
2. Foundational Components of Secure Search Implementation
a. Encryption Protocols and Data Transmission Security
Introduction of SSL/TLS encryption to secure data transmission between user devices
and servers, preventing unauthorized access to search queries and results during
transit.
b. User Authentication and Access Controls
Implementation of strong authentication mechanisms (OAuth, JWT) to verify user
identities before allowing access to search functionalities.
Integration of access controls and role-based permissions to ensure authorized access
to search results and functionalities.
c. Data Anonymization and Pseudonymization
Removal of personally identifiable information (PII) from search query logs and
indexes using anonymization techniques to protect user identities.
Utilization of pseudonyms or tokens to represent user identities in search logs or
indexes, preserving anonymity while allowing for analytics (a minimal sketch follows below).
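As a concrete illustration of pseudonymization, a search-log entry can replace the raw user identifier with a keyed hash, so analytics can still group queries per user without storing the real identity. This is only a sketch (the secret key and field names are placeholders), written in Node.js to match the application built in the first exercise:

var crypto = require('crypto');

// Secret key kept outside the log pipeline (placeholder value)
var PSEUDONYM_KEY = 'replace-with-a-strong-secret';

// Replace the user id with a stable pseudonym before the entry is written to the search log
function pseudonymizeLogEntry(entry){
  var pseudonym = crypto
    .createHmac('sha256', PSEUDONYM_KEY)
    .update(entry.userId)
    .digest('hex');
  return {
    user: pseudonym,           // stable token, not the real identity
    query: entry.query,        // the search query itself
    timestamp: entry.timestamp
  };
}

console.log(pseudonymizeLogEntry({ userId: 'alice', query: 'privacy settings', timestamp: Date.now() }));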
d. Secure Indexing and Tokenization Practices
Tokenization of search queries to break them into tokens or keywords before
indexing, ensuring efficient search functionalities while securing indexes from
unauthorized access.
3. Advanced Techniques and Algorithms for Enhanced Security
a. Homomorphic Encryption for Secure Computations
Exploration and adoption of homomorphic encryption techniques to perform
computations on encrypted search indexes without decryption, ensuring data privacy
during computations.
b. Secure Multi-Party Computation (MPC)
Implementation of MPC protocols, enabling multiple parties to jointly compute search
results without revealing individual inputs, ensuring privacy during computation
processes.
4. Privacy Policies, User Consent, and Transparency
a. Explicit User Consent and Data Handling Policies
Transparent communication of privacy policies, obtaining explicit user consent for
collecting, storing, and utilizing search-related data, fostering user trust.
b. Data Minimization and Compliance with Regulations
Adherence to stringent data protection regulations (GDPR, CCPA), minimizing data
collection to only essential information necessary for search functionalities.
5. Continuous Improvement and Compliance Measures
a. Regular Security Audits and Monitoring
Conducting periodic security audits to identify vulnerabilities and address potential
threats within search functionalities.
Continuous monitoring of security protocols and compliance with data protection
regulations to ensure adherence and mitigate risks.
6. User Education and Empowerment
a. User Training on Secure Search Practices
Education initiatives to familiarize users with secure search practices, privacy
settings, and the platform's security features, empowering them to protect their data.
b. Providing Control and Redaction Options
Granting users control to redact or delete certain search queries or results associated
with their accounts, offering enhanced privacy control.
7. Impact on User Experience and Platform Integrity
Secure search implementations aim to strike a balance between robust security measures and
user experience. While enhancing data security, it ensures that search functionalities remain
seamless, efficient, and user-friendly, contributing to the platform's integrity and
trustworthiness.
The implementation of secure search in social media platforms integrates various
components, from encryption protocols to user education, fostering a secure ecosystem that
prioritizes user privacy, data protection, and a seamless search experience. This multifaceted
approach is pivotal in fortifying social media platforms against evolving threats while
maintaining user trust and platform integrity.
RESULT:
Thus the secure search in social media has been performed successfully.
EXP NO.6: CREATE A SIMPLE SECURITY AND PRIVACY DETECTOR
AIM:
To create a simple security and privacy detector.
PROCEDURE:
Threat detection and response is the practice of identifying any malicious activity that
could compromise the network and then composing a proper response to mitigate or
neutralize the threat before it can exploit any present vulnerabilities.
Within the context of an organization's security program, the concept of "threat detection" is
multifaceted. Even the best security programs must plan for worst-case scenarios: when
someone or something has slipped past their defensive and preventative technologies and
becomes a threat.
Detection and response is where people join forces with technology to address a breach. A
strong threat detection and response program combines people, processes, and technology to
recognize signs of a breach as early as possible, and take appropriate actions.
Detecting Threats
When it comes to detecting and mitigating threats, speed is crucial. Security programs must
be able to detect threats quickly and efficiently so attackers don’t have enough time to root
around in sensitive data. A business’s defensive programs can ideally stop a majority of
previously seen threats, meaning they should know how to fight them.
These threats are considered "known" threats. However, there are additional “unknown”
threats that an organization aims to detect. This means the organization hasn't encountered
them before, perhaps because the attacker is using new methods or technologies.
Known threats can sometimes slip past even the best defensive measures, which is why most
security organizations actively look for both known and unknown threats in their
environment. So how can an organization try to detect both known and unknown threats?
Leveraging Threat Intelligence
Threat intelligence is a way of looking at signature data from previously seen attacks and
comparing it to enterprise data to identify threats. This makes it particularly effective at
detecting known threats, but not unknown threats. Known threats are those that are
recognizable because the malware or attacker infrastructure has been identified as associated
with malicious activity.
Unknown threats are those that haven't been identified in the wild (or are ever-changing), but
threat intelligence suggests that threat actors are targeting a swath of vulnerable assets, weak
credentials, or a specific industry vertical. User behavior analytics (UBA) are invaluable in
helping to quickly identify anomalous behavior - possibly indicating an unknown threat -
across your network. UBA tools establish a baseline for what is "normal" in a given
environment, then leverage analytics (or in some cases, machine learning) to determine and
alert when behavior is straying from that baseline.
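To make the baseline idea concrete, here is a minimal sketch of such a detector; the use of login-hour statistics, the threshold, and the field names are illustrative assumptions:

// Build a per-user baseline (mean and standard deviation of login hour) from historical events
function buildBaseline(events){
  var hours = events.map(function(e){ return new Date(e.timestamp).getHours(); });
  var mean = hours.reduce(function(a, b){ return a + b; }, 0) / hours.length;
  var variance = hours.reduce(function(a, h){ return a + Math.pow(h - mean, 2); }, 0) / hours.length;
  return { mean: mean, stdDev: Math.sqrt(variance) };
}

// Flag an event as anomalous when it deviates from the baseline by more than three standard deviations
function isAnomalous(baseline, event){
  var hour = new Date(event.timestamp).getHours();
  var deviation = Math.abs(hour - baseline.mean);
  return deviation > 3 * Math.max(baseline.stdDev, 1); // floor the std dev to avoid zero-variance alerts
}

var history = [
  { user: 'alice', timestamp: '2024-01-01T09:10:00Z' },
  { user: 'alice', timestamp: '2024-01-02T10:05:00Z' },
  { user: 'alice', timestamp: '2024-01-03T09:45:00Z' }
];
var baseline = buildBaseline(history);
console.log(isAnomalous(baseline, { user: 'alice', timestamp: '2024-01-04T03:20:00Z' })); // likely true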
Attacker behavior analytics (ABA) can expose the various tactics, techniques, and procedures
(TTPs) by which attackers can gain access to your corporate network. TTPs include things
like malware, cryptojacking (using your assets to mine cryptocurrency), and confidential data
exfiltration.
During a breach, every moment an attacker is undetected is time for them to tunnel further
into your environment. A combination of UBAs and ABAs offer a great starting point to
ensure your security operations center (SOC) is alerted to potential threats as early as possible
in the attack chain.
Responding to Security Incidents
One of the most critical aspects to implementing a proper incident response framework is
stakeholder buy-in and alignment, prior to launching the framework. No one likes surprises
or questions-after-the-fact when important work is waiting to be done. Fundamental incident
response questions include:
Do teams know who is responsible at each phase of incident response?
Is the proper chain of communications well understood?
Do team members know when and how to escalate issues as needed?
A great incident response plan and playbook minimizes the impact of a breach and ensures
things run smoothly, even in a stressful breach scenario. If you're just getting started, some
important considerations include:
Defining roles and duties for handling incidents: These responsibilities, including
contact information and backups, should be documented in a readily accessible
channel.
Considering who to loop in: Think beyond IT and security teams to document which
cross-functional or third-party stakeholders – such as legal, PR, your board, or
customers – should be looped in and when. Knowing who owns these various
communications and how they should be executed will help ensure responses run
smoothly and expectations are met along the way.
What Should a Robust Threat Detection Program Employ?
Security event threat detection technology to aggregate data from events across the
network, including authentication, network access, and logs from critical systems.
Network threat detection technology to understand traffic patterns on the network
and monitor network traffic, as well as to the internet.
Endpoint threat detection technology to provide detailed information about possibly
malicious events on user machines, as well as any behavioral or forensic information
to aid in investigating threats.
Penetration tests, in addition to other preventative controls, to understand detection
telemetry and coordinate a response.
A Proactive Threat Detection Program
To add a bit more to the element of telemetry and being proactive in threat response, it’s
important to understand there is no single solution. Instead, a combination of tools acts as a
net across the entirety of an organization's attack surface, from end to end, to try and capture
threats before they become serious problems.
Setting Attacker Traps with Honeypots
Some targets are just too tempting for an attacker to pass up. Security teams know this, so
they set traps in hopes that an attacker will take the bait. Within the context of an
organization's network, an intruder trap could include a honeypot target that may seem to
house network services that are especially appealing to an attacker. These “honey credentials”
appear to have user privileges an attacker would need in order to gain access to sensitive
systems or data.
When an attacker goes after this bait, it triggers an alert so the security team knows there is
suspicious activity in the network they should investigate. Learn more about the
different types of deception technology.
Threat Hunting
Instead of waiting for a threat to appear in the organization's network, a threat hunt enables
security analysts to actively go out into their own network, endpoints, and security
technology to look for threats or attackers that may be lurking as-yet undetected. This is an
advanced technique generally performed by veteran security and threat analysts.
By employing a combination of these proactively defensive methods, a security team can
monitor the security of the organization's employees, data, and critical assets. They’ll also
increase their chances of quickly detecting and mitigating a threat.
RESULT:
Thus a simple security and privacy detector has been created successfully by following these
procedures.