Lab Manual of Big Data Analytics Lab (BCA04207): BCA General II Year IV Semester Academic Session 2021-22
LAB MANUAL
OF
BIG DATA ANALYTICS LAB (BCA04207)
TABLE OF CONTENTS
Lab Manual – BCA04207 –Big Data Analytics Lab Poornima University, Academic Year 2021-22
LAB RULES
Responsibilities of Users
Users are expected to follow some fairly obvious rules of conduct:
Always:
Never:
If you have problems or questions, please approach the faculty, the lab in-charge or
the lab support staff. They will help you. We need your full support and cooperation
for the smooth functioning of the lab.
INSTRUCTIONS
All students must prepare the theory for the next experiment in advance.
Students must bring the practical file and the lab copy.
Previous programs should be written in the practical file.
All students must follow these instructions; a student who fails to do so may not be
allowed in the lab.
Part A
Experiment 1:
Prepare the infrastructure and understand the purpose of each software requirement for
setting up a single-node Hadoop cluster.
WinSCP
Putty
Ubuntu
VMPlayer
Hadoop version
Experiment 2:
Create single node Hadoop cluster.
Installing Ubuntu on VM
Installing Java
SSH Configuration
Core-site.xml Configuration
Hdfs-site.xml Configuration
Yarn-site.xml Configuration
Experiment 3:
Testing Single Node cluster, Web UI ports and Exploring different daemons of Hadoop
Cluster.
Experiment 4:
Perform / execute the following set of basic Hadoop commands:
appendToFile
cat
chgrp
chmod
chown
copyFromLocal
copyToLocal
count
cp
Experiment 5:
Perform / execute the following set of basic Hadoop commands:
du
dus
expunge
get
getfacl
getfattr
getmerge
ls
lsr
mkdir
Part B
Experiment 6:
Perform / execute the following set of basic Hadoop commands:
moveFromLocal
moveToLocal
mv
put
rm
rmr
setfacl
setfattr
setrep
stat
tail
test
text
touchz
Experiment 7:
Install the Eclipse IDE on the single-node cluster for executing MapReduce jobs and understand
the role of the dependent libraries in processing a job.
Experiment 8:
Perform a MapReduce word count job for a given input file with the number of reducers
set to 2.
Experiment 9:
Perform a MapReduce word count job for a given input file with the number of reducers
set to 6, and analyze Experiments 8 and 9.
Experiment 10:
Perform a MapReduce word count job for a given input file with only a Mapper configured
(no reducer is involved), and analyze Experiments 8, 9 and 10.
Experiment 11:
Implement one executable Hadoop MapReduce program to perform the inner join of two
tables based on “Student ID”. You can create sample data in the format below and can further
Experiment 12:
Implement one executable Hadoop MapReduce program to calculate the highest temperature
for every given year. You can consider the sample data below for executing this job:
MARKS SCHEME
Total Marks – 10
Attendance    Discipline    Performance & Viva    Record    Total
1             1             5                     3         10
LAB PLAN
Exp. 1 1 1
Exp. 2 1 2
Exp. 3 1 3
Exp. 4 2 4
Exp. 5 1 5
Exp. 6 1 6
Exp. 7 1 7
Exp. 8 2 8
Exp. 9 1 9
Exp. 10 1 10
Exp. 11 1 11
Exp. 12 1 12
Lab Objective
Big Data is high-volume, high-velocity and high-variety information assets that demand
cost-effective, innovative forms of information processing for enhanced insight and decision
making.
Hardware:
1. Computer
2. Peripheral devices
Software:
1. Python, Anaconda, PyCharm, Visual Studio Code, Hadoop, Linux
Text Books:
1. Big Data Analytics by G. Sudha Sadasivam and R. Thirumahal, Oxford University Press.
Reference Books:
1. Big Data Analytics with Microsoft HDInsight,
Reference Websites:
1. www.tutorialpoint.com
2. www.w3school.com
Experiments
Experiment No.:1
Name of Experiment: Briefly understand Big Data, the underlying technologies and the tools
used in Big Data Analytics.
Output:
Experiment No.:2
Name of Experiment: Write a program to apply Weka Tool and analyze Apriori
Algorithm.
Output:
Viva question :
Experiment No.:3
Name of Experiment: Calculate Mean, Median, Mode, Standard Deviation, Percentile, Data
Distribution and Normal Data Distribution of an array of integer data.
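The quantities named in this experiment can be sketched in Python as follows. This is only an illustrative sketch: the sample array `data` is made up, the helper `percentile` is a hypothetical name that uses linear interpolation (one of several percentile conventions), and the standard deviation shown is the population form from the standard `statistics` module.

```python
import random
import statistics
from collections import Counter

# Hypothetical sample array of integer data (not from the manual).
data = [2, 4, 4, 4, 5, 5, 7, 9]

def percentile(values, p):
    """Return the p-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
std_dev = statistics.pstdev(data)   # population standard deviation
p50 = percentile(data, 50)

# Data distribution: how often each value occurs in the array.
distribution = Counter(data)

# Normal data distribution: draw values around the same mean and spread.
# The seed and sample size are arbitrary choices for this sketch.
rng = random.Random(0)
normal_sample = [rng.gauss(mean, std_dev) for _ in range(1000)]

print("Mean:", mean)
print("Median:", median)
print("Mode:", mode)
print("Standard deviation:", std_dev)
print("50th percentile:", p50)
print("Distribution:", dict(distribution))
```

With real lab data, replace `data` with the array supplied for the experiment; a histogram of `normal_sample` would show the familiar bell shape.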
Viva question :
1. H
2. W
3. W
4. H
Experiment No.:2.B
Step2: Find the length of str1 and store it in a variable, say i = length_of_str.
Step3: Run a loop from index 0 to the end of str2 and copy each character into str1, starting at the ith index.
Flowchart:
Theory: The concatenate (combine) operation joins the contents of two strings into one
string.
To merge two strings, we first copy the characters of the first string into a new string
until the NULL character is encountered in the first string. We then copy the characters of
the second string into the new string until the NULL character is encountered in the second
string, and terminate the new string with a NULL character.
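The copy-then-copy procedure described above can be sketched in Python. This is an illustration only: the function name `concatenate` is hypothetical, and because Python strings carry their own length, the loops stop at the end of each string instead of at a NULL character.

```python
def concatenate(str1, str2):
    """Build a new string by copying str1 and then str2, one character at a time."""
    result = []
    for ch in str1:        # copy every character of the first string
        result.append(ch)
    for ch in str2:        # then copy every character of the second string
        result.append(ch)
    return "".join(result)

print(concatenate("Big", "Data"))
```

In everyday Python the same result is obtained with `str1 + str2`; the explicit loops are shown only to mirror the steps of the theory.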
Output:
Viva question :
Experiment No.:3.A
Flowchart:
Theory: To copy a string, we first find the position at which the given string has to be copied.
Output:
Viva Question:
1) What is a string?
Experiment No.:3.B
Name of Experiment: Use a recursive function for the towers of Hanoi with three
discs.
Sub program:
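The sub program Hanoi recursion(num, a, b, c) moves num discs from tower a to tower b, using tower c as the intermediate tower.
Step 1: if num > 1 call the sub program Hanoi recursion(num-1, a, c, b)
Step 2: print the move of disc num from a to b
Step 3: if num > 1 call the sub program Hanoi recursion(num-1, c, b, a)
Step 4: return to main program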
Flowchart:
The Towers of Hanoi problem has three towers: source, intermediate and destination. We have
to transfer all the disks from the source tower to the destination tower, moving one disk at
a time. The restriction is that a bigger disk must never be placed on a smaller one; the
intermediate tower is used for this purpose. At the end, the disks on the destination tower
must be arranged in the same order as they were on the source tower at the start.
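A minimal recursive sketch of this problem in Python is given below. The function name `hanoi` and its parameter order (source, destination, intermediate) are assumptions made for this illustration; the tower labels A, B and C follow the sample output of this experiment.

```python
def hanoi(num, source, destination, intermediate):
    """Return the list of moves that transfers num disks from source to destination."""
    if num == 0:
        return []
    moves = []
    # Move the top num-1 disks out of the way, onto the intermediate tower.
    moves += hanoi(num - 1, source, intermediate, destination)
    # Move the largest remaining disk directly to the destination tower.
    moves.append(f"Disk {num} moved from {source} to {destination}")
    # Move the num-1 disks from the intermediate tower onto the destination.
    moves += hanoi(num - 1, intermediate, destination, source)
    return moves

for line in hanoi(3, "A", "C", "B"):
    print(line)
```

For n disks this produces 2^n - 1 moves, so three disks need 7 moves.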
Output:
Input No of Disk : 2
Output : Disk 1 moved from A to B
Disk 2 moved from A to C
Disk 1 moved from B to C
Input No of Disk: 3
Output : Disk 1 moved from A to C
Disk 2 moved from A to B
Disk 1 moved from C to B
Disk 3 moved from A to C
Disk 1 moved from B to A
Disk 2 moved from B to C
Disk 1 moved from A to C
Viva Question:
Experiment No.:4.A
Step1: Start
Step 2: Set J = N
Step 8: Stop
Flowchart:
Theory:
Output:
Viva Question:
Experiment No.:4.B
Consider that LA is a linear array with N elements and K is a positive integer such that K<=N.
The following algorithm deletes the element at the Kth position of LA.
Step 1. Start
Step 2. Set J = K
Step 7. Stop
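Since only some of the steps are listed above, the following Python sketch shows one common way to carry out such a deletion: every element after the Kth position is shifted one place to the left and N is reduced by one. The function name `delete_at` is hypothetical, and K is treated as a 1-based position, which is an assumption based on the wording "Kth position".

```python
def delete_at(la, n, k):
    """Delete the element at 1-based position k from the first n elements of la.

    Later elements are shifted one place left; the new element count is returned.
    """
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    j = k
    while j < n:
        la[j - 1] = la[j]   # shift the next element into the freed slot
        j += 1
    return n - 1

la = [1, 3, 5, 7, 8]
n = delete_at(la, len(la), 3)
remaining = la[:n]          # only the first n slots are still meaningful
for index, value in enumerate(remaining):
    print(f"LA[{index}] = {value}")
```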
Flowchart:
Theory:
Output:
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
The array elements after deletion :
LA[0] = 1
LA[1] = 3
LA[2] = 7
LA[3] = 8
Viva Question:
Experiment No.:5
Name of Experiment: Write a program to create a linked list and to display it.
Step 1: Start
6.4 Connect this new node at the end of the linked list
Step 8: If choice==3
Step 9: End
Flowchart:
Theory:
In this procedure we create a singly linked list and display its elements. We first
define the data structure for a node of the linked list. A node contains a data part and a
pointer part. Each node is connected to the next node through its pointer part.
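The node structure described above can be sketched in Python, where an object reference plays the role of the pointer part. The class and method names (`Node`, `LinkedList`, `append`, `display`) are illustrative choices, not names taken from the manual.

```python
class Node:
    """One node of a singly linked list: a data part and a pointer part."""
    def __init__(self, data):
        self.data = data
        self.next = None    # reference to the next node; None marks the end

class LinkedList:
    def __init__(self):
        self.head = None

    def append(self, data):
        """Connect a new node at the end of the linked list."""
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            return
        current = self.head
        while current.next is not None:
            current = current.next
        current.next = new_node

    def display(self):
        """Return the elements in order, ending with NULL."""
        parts = []
        current = self.head
        while current is not None:
            parts.append(str(current.data))
            current = current.next
        return "->".join(parts + ["NULL"])

numbers = LinkedList()
for value in (10, 20, 30):
    numbers.append(value)
print(numbers.display())
```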
Output:
Viva Question:
Experiment No.:6 A
Insertion Sort:
Step 1. start
begin
begin
Step 5. Stop
Flowchart:
Theory:
Insertion sort is similar to sorting playing cards. To sort the cards in your hand, you extract
a card, shift the remaining cards, and then insert the extracted card in its correct place. The
worst-case time complexity of insertion sort is O(n^2).
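The card analogy can be sketched in Python as follows. The function name `insertion_sort` and the sample values are chosen for illustration only.

```python
def insertion_sort(arr):
    """Sort arr in place in ascending order and return it."""
    for i in range(1, len(arr)):
        key = arr[i]                 # the "card" extracted from the hand
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]      # shift larger elements one place right
            j -= 1
        arr[j + 1] = key             # insert the card in its correct place
    return arr

print(insertion_sort([25, 31, 30, 12, 1]))
```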
Output:
Viva Question:
Experiment No.:6 B
Selection Sort:
1) Start
2) Initialize the variables i, j, temp and arr[]
3) Loop over the array to display the unsorted elements: while the loop condition is true,
print the array element and increment i.
Otherwise go to step 4
4) Start the outer loop on i and check its condition. If the condition is true, go to the inner loop.
5) Start the inner loop on j and check its condition. If the condition is true, go to the if condition.
6) If the condition if(arr[i]>arr[j]) is true, swap the two elements:
i) temp=arr[i]
ii) arr[i]=arr[j]
iii) arr[j]=temp
7) Increment j
8) Loop over the array to display the sorted elements.
9) Print the sorted elements
10) Stop
Flowchart:
Theory:
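This is one of the simplest methods of sorting. In this method, to sort the data in ascending
order, the 0th element is compared with all other elements. If the 0th element is found to be
greater than the compared element, the two are interchanged. The same comparison is then
repeated for the 1st element, the 2nd element and so on, so that after each pass the smallest
remaining element is in its final place.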
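The compare-and-interchange method described above can be sketched in Python. The function name `selection_sort` is illustrative, and this sketch swaps immediately whenever arr[i] > arr[j], as the algorithm steps of this experiment do; textbook selection sort usually finds the minimum first and swaps once per pass.

```python
def selection_sort(arr):
    """Sort arr in place in ascending order and return it."""
    n = len(arr)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if arr[i] > arr[j]:
                # interchange the two elements (temp is implicit in Python)
                arr[i], arr[j] = arr[j], arr[i]
    return arr

values = [25, 31, 30, 12, 1]
print("Array before sort:", values)
print("Array after sort:", selection_sort(values))
```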
Output:
Selection sort
2 13 17 25 31
Selection sort
Array before sort
25 31 30 12 1
Viva Question:
PART-B
Experiment No. :1
Aim: To create a singly linked list and insert a new item in the list.
Step 1: Start
Step 6: Switch(choice)
Step 7: If(choice==1)
7.2 Read the element data from user and store in ptr->data
7.5 Switch(choice)
7.5.1: If choice==1
7.5.3: If choice==2
7.5.5: If choice==3
Step 8: If choice==2
10.1 Exit()
Flowchart:
Theory:
In this program we create a singly linked list and insert elements into it at the first
position, at the last position, and after a node holding specified data.
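The three kinds of insertion can be sketched in Python as below. The function names (`insert_first`, `insert_last`, `insert_after`, `display`) are hypothetical, and the values are chosen so that the printed lists resemble the sample output of this experiment.

```python
class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node

def insert_first(head, data):
    """Insert at the first position; the new node becomes the head."""
    return Node(data, head)

def insert_last(head, data):
    """Insert at the last position by walking to the final node."""
    new_node = Node(data)
    if head is None:
        return new_node
    current = head
    while current.next is not None:
        current = current.next
    current.next = new_node
    return head

def insert_after(head, target, data):
    """Insert after the first node whose data equals target (no change if absent)."""
    current = head
    while current is not None and current.data != target:
        current = current.next
    if current is not None:
        current.next = Node(data, current.next)
    return head

def display(head):
    parts = []
    while head is not None:
        parts.append(str(head.data))
        head = head.next
    return "->".join(parts + ["NULL"])

head = insert_first(None, 11)
head = insert_first(head, 22)
head = insert_last(head, 33)
before = display(head)
head = insert_after(head, 11, 44)
after = display(head)
print(before)
print(after)
```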
Output:
22->11->33->NULL
22->11->44->33->NULL
22->11->44->33->NULL
Viva Question:
Experiment No. :2
Aim: To delete a node from a singly linked list at various positions such as first, last
and after a specified position.
Step 1: Start
Step 6: Switch(choice)
Step 7: If(choice==2)
7.2: Switch(choice)
7.2.1: If choice==1
7.2.3: If choice==2
7.2.5: If choice==3
Step 8: If choice==2
8.1 Exit()
Flowchart:
Theory:
In this program we create a singly linked list and delete nodes from it at the first
position, at the last position, and after a specified position.
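The three kinds of deletion can be sketched in Python as below. All function names are hypothetical, and "after a specified position" is interpreted here as removing the node that follows the node at the given 1-based position, which is an assumption.

```python
class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node

def build(values):
    """Build a linked list from a Python list and return its head."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head

def delete_first(head):
    return head.next if head is not None else None

def delete_last(head):
    if head is None or head.next is None:
        return None
    current = head
    while current.next.next is not None:   # stop at the second-to-last node
        current = current.next
    current.next = None
    return head

def delete_after(head, position):
    """Delete the node that follows the node at the 1-based position."""
    if position < 1:
        return head
    current = head
    for _ in range(position - 1):
        if current is None:
            return head
        current = current.next
    if current is not None and current.next is not None:
        current.next = current.next.next
    return head

def display(head):
    parts = []
    while head is not None:
        parts.append(str(head.data))
        head = head.next
    return "->".join(parts + ["NULL"])

head = build([11, 22, 33, 44, 55, 66, 77])
print(display(head))
head = delete_first(head)
print(display(head))
head = delete_last(head)
print(display(head))
head = delete_after(head, 2)
print(display(head))
```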
Output:
11->22->33->44->55->66->77->NULL
22->33->44->55->66->77->NULL
22->33->44->55->66->NULL
Viva Question:
Experiment No. :3
Aim: To create a stack using indirect access or pointers and perform the basic
operations.
Flowchart:
Theory:
In this algorithm we implement the stack operations using pointers. The stack operations
are push and pop. The push operation inserts an element onto the top of the stack and the
pop operation removes the element at the top of the stack.
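A pointer-based stack can be sketched in Python with linked nodes, where `top` refers to the most recently pushed node. The class and method names are illustrative; the `elements` method loosely corresponds to the "Elements present in Stack" menu option.

```python
class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node

class Stack:
    def __init__(self):
        self.top = None   # reference to the node at the top of the stack

    def push(self, data):
        """Insert an element onto the top of the stack."""
        self.top = Node(data, self.top)

    def pop(self):
        """Remove and return the element at the top of the stack."""
        if self.top is None:
            raise IndexError("pop from empty stack")
        node = self.top
        self.top = node.next
        return node.data

    def elements(self):
        """Return the elements from top to bottom."""
        items = []
        current = self.top
        while current is not None:
            items.append(current.data)
            current = current.next
        return items

stack = Stack()
for value in (10, 20, 30):
    stack.push(value)
popped = stack.pop()
print("Popped:", popped)
print("Elements present in stack:", stack.elements())
```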
Output:
======================================
MENU
======================================
[1] Using Push Function
[2] Using Pop Function
======================================
MENU
======================================
[1] Using Push Function
[2] Using Pop Function
[3] Elements present in Stack
[4] Exit
Viva Question:
1) Define stack?
Ans: A stack is a linear data structure in which data items are inserted and deleted at one
end.
2) Define data structure?
Ans: A data structure is a collection of organized data items that are related to each other.
Experiment No. :4
Step 1: Start
Flowchart:
Theory:
In this program we implement the queue operations using pointers. The queue operations
are push and pop. The push operation inserts an element at the rear of the queue and the
pop operation removes the element at the front of the queue.
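A pointer-based queue can be sketched in Python with linked nodes and two references, `front` and `rear`. The class name and the `elements` helper are illustrative; the method names `push` and `pop` follow the wording of the theory above, although insert/enqueue and remove/dequeue are the more usual names for queue operations.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class Queue:
    def __init__(self):
        self.front = None   # node that will be removed next
        self.rear = None    # node that was inserted last

    def push(self, data):
        """Insert an element at the rear of the queue."""
        node = Node(data)
        if self.rear is None:
            self.front = self.rear = node
        else:
            self.rear.next = node
            self.rear = node

    def pop(self):
        """Remove and return the element at the front of the queue."""
        if self.front is None:
            raise IndexError("pop from empty queue")
        node = self.front
        self.front = node.next
        if self.front is None:   # the queue became empty
            self.rear = None
        return node.data

    def elements(self):
        """Return the elements from front to rear."""
        items = []
        current = self.front
        while current is not None:
            items.append(current.data)
            current = current.next
        return items

queue = Queue()
for value in (10, 20, 30):
    queue.push(value)
first_out = queue.pop()
print("Removed:", first_out)
print("Elements present in queue:", queue.elements())
```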
Output:
[4] Exit
Viva Question:
1) Define queue?
Ans: A queue is a linear, sequential list of items that are accessed in first in, first
out (FIFO) order.
Experiment No. :5
Name of Experiment: Creating a binary search tree and traversing it using in order,
preorder and post order.
Step 1: Start
Step 7: If choice=1
Step 1: Start
Step 2: If t= null
Step 6: Temp->rc=null
Step 11:Return t
Step 1: Start
Step 2: x=d
Step 4: If x->data =t
Step 5: Break
Step 6: Parent =x
Step 7: if t<x->data
Step 8: t=t->lc
Step 9: t=t->rc
Step11: parent =x
Step12: If parent==null
Step 19:x->data=insert->data
Step 20:x=insert
Step 1: Start
Step 2: If t!=null
Step 6: Stop
Step 1: Start
Step 2: If t!=null
Step 6: Stop
Step 1: Start
Step 2: If t!=null
Step 6: Stop
Flowchart:
Theory:
The tree data structure is non-linear. A binary tree is a special case in which each node
has at most two children. In a binary search tree, the value at each node must be greater
than the values in its left subtree and less than the values in its right subtree. A tree is
a recursive data structure, and recursive programming techniques are commonly used with
trees. A tree can be traversed in three major ways:
i) Inorder traversal: Here the left child is visited first, followed by the root and finally
the right child.
ii) Preorder traversal: Here the root is visited first, followed by the left child and finally
the right child.
iii) Postorder traversal: Here the left child is visited first, followed by the right child
and finally the root.
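A binary search tree with the three traversals can be sketched in Python as follows. The field names `lc` and `rc` (left child, right child) follow the algorithm steps of this experiment; the class and function names, the sample keys, and the choice to ignore duplicate keys are assumptions of this sketch.

```python
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.lc = None   # left child
        self.rc = None   # right child

def insert(t, data):
    """Insert data into the tree rooted at t and return the (possibly new) root."""
    if t is None:
        return TreeNode(data)
    if data < t.data:
        t.lc = insert(t.lc, data)
    elif data > t.data:
        t.rc = insert(t.rc, data)
    return t             # equal keys are ignored

def inorder(t):
    return [] if t is None else inorder(t.lc) + [t.data] + inorder(t.rc)

def preorder(t):
    return [] if t is None else [t.data] + preorder(t.lc) + preorder(t.rc)

def postorder(t):
    return [] if t is None else postorder(t.lc) + postorder(t.rc) + [t.data]

root = None
for key in (50, 30, 70, 20, 40, 60, 80):
    root = insert(root, key)

print("Inorder:  ", inorder(root))
print("Preorder: ", preorder(root))
print("Postorder:", postorder(root))
```

The inorder traversal of a binary search tree always lists the keys in ascending order, which is a quick way to check the insert routine.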
Output:
Viva Question:
Experiment No. :6
Step1: Start
Step3: Declare the array and its size, and initialize j=0
Step4: Read the array elements to be sorted.
Step5: Display the array elements before the merge sort.
Step7: Display the array elements after the merge sort by using the following statement.
for( j=0;j<Max_ary;j++)
Step8: Stop
Subprogram
Mid=(end+start)/2
Step5:merge_sort(x,mid+1,start)
Mrg1=0;
X[j+end]= executing[mrg2++]
Step12: x[j+end]=executing[mrg1++]
Flowchart:
Theory:
Merge sort splits the list to be sorted into two halves of (nearly) equal size and places
them in separate arrays. Each array is recursively sorted, and the two are then merged back
together to form the final sorted list. Merge sort has a time complexity of O(n log n).
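The split, sort and merge idea can be sketched in Python as below. This is a simplified illustration that returns a new list; it does not reproduce the in-place sub program with the `executing` buffer outlined in the steps of this experiment, and the function names are illustrative.

```python
def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two still has elements
    merged.extend(right[j:])
    return merged

def merge_sort(values):
    """Return a new sorted list using recursive merge sort."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    return merge(merge_sort(values[:mid]), merge_sort(values[mid:]))

print(merge_sort([25, 31, 30, 12, 1]))
```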
Output:
Section sort
Viva Question:
Ans: The merge sort splits the list to be sorted into two equal halves, and
places them in separate arrays. Each array is recursively sorted, and then
merged back together to form the final sorted list.