BDH Record - Merged
Record Work
Department: MBA
BONAFIDE CERTIFICATE
Examiner-1 Examiner-2
List of Experiments
EX NO | DATE | NAME OF THE EXERCISE
1 | 20.07.2022 | Installation of Hadoop with Ubuntu
2 | 20.07.2022 | Basic commands to work with Ubuntu
3 | 02.08.2022 | Basic HDFS commands
4 | 10.08.2022 | HDFS Shell commands – file folder commands
5 | 24.08.2022 | HDFS Admin commands
6 | 15.09.2022 | Map reduce word count
7 | 23.09.2022 | Map reduce max temperature
8 | 20.10.2022 | Mongo DB commands
9 | 25.10.2022 | Pig Latin commands
10 | 27.10.2022 | Map reduce matrix multiplication
Exercise No: 1
INSTALLATION OF HADOOP WITH UBUNTU
Date: 20.07.2022
AIM
To install Hadoop on an Ubuntu system
PROCEDURE
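For reference, a minimal single-node installation typically follows the steps below; the Hadoop release, download link and installation paths are illustrative assumptions and may differ from the exact steps carried out in the lab.
sudo apt update
sudo apt install openjdk-8-jdk ssh -y
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
tar -xzf hadoop-3.3.1.tar.gz
mv hadoop-3.3.1 ~/hadoop
echo 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64' >> ~/.bashrc
echo 'export HADOOP_HOME=$HOME/hadoop' >> ~/.bashrc
echo 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' >> ~/.bashrc
source ~/.bashrc
# set fs.defaultFS in etc/hadoop/core-site.xml, dfs.replication in etc/hadoop/hdfs-site.xml,
# and JAVA_HOME in etc/hadoop/hadoop-env.sh, then enable passwordless SSH and format the NameNode
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hdfs namenode -format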
OUTPUT
RESULT
The installation of Hadoop with Ubuntu has been performed
Exercise No: 2
BASIC COMMANDS TO WORK WITH UBUNTU
Date: 20.07.2022
AIM
To work with basic commands in Ubuntu
PROCEDURE
Step 1: Give the following commands to work with some basic functions.
Step 2: pwd - This command prints the present working directory in which you are operating.
Step 3: dir - This command lists the files and directories available in the present working directory.
Step 4: ls - This command lists all the directories and files inside the present working directory.
Step 5: cd - This command changes the current directory in the terminal.
Step 6: touch - This command creates a new empty file.
Step 7: mkdir - This command makes a new directory in the present working directory.
Step 8: rmdir - This command removes an empty directory.
Step 9: ping - Use this command to check the network connectivity to a host or service.
Step 10: hostname - This command displays the hostname of the system.
Step 11: uname - Use this command to get the kernel name, release number, version of Linux and much more.
Step 12: hadoop - This command displays the Hadoop usage information, confirming that Hadoop is available on the PATH.
Step 13: hadoop version - This command displays the installed Hadoop version.
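A short sample session using a few of these commands is shown below; the directory and file names are only examples.
pwd
mkdir demo
cd demo
touch notes.txt
ls
hostname
uname -r
hadoop version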
OUTPUT
RESULT
The basic commands in Ubuntu have been executed
Exercise No: 3
BASIC HDFS COMMANDS
Date: 02.08.2022
AIM
To work with basic HDFS commands
PROCEDURE
Step 1: To format the NameNode in Hadoop, give the following commands
cd hadoop
cd bin
./hadoop namenode -format
Step 2: After navigating into the bin folder of the Hadoop directory, the NameNode formatting is done
Step 3: To create the cluster and start the nodes, change to the sbin folder and run the following
commands.
cd ..
cd sbin
./start-dfs.sh
Step 4: To start the resource manager and check the running nodes, input the following
commands
./start-yarn.sh
jps
Step 5: To check the status of the running nodes, run the below command.
jps
Step 6: For creating a directory input the following command
hdfs dfs -mkdir -p /user/
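Once Steps 3 to 5 have been completed, the running daemons can be verified with jps; on a single-node setup the listing typically includes the following (process IDs will vary and the exact set depends on the configuration).
jps
# typically shows: NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, Jps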
OUTPUT
RESULT
Basic HDFS commands have been performed and a directory has been created
Exercise No: 4
HDFS SHELL COMMANDS – FILE FOLDER COMMANDS
Date: 10.08.2022
AIM
To perform HDFS shell commands
PROCEDURE
Step 1: To copy a file from local FS to HDFS, run the following code
hdfs dfs -copyFromLocal /home/hduser1/ls /user/input
Step 2: To move a file from local FS to HDFS, run the following code
hdfs dfs -moveFromLocal /home/hduser1/This /user/input
Step 3: To copy a file from HDFS to the local FS,
hdfs dfs -copyToLocal /user/input/This /home/hduser1/Desktop/
Step 4: To display the content of a file,
hdfs dfs -cat /user/input/This
Step 5: To list the directory,
hdfs dfs -ls -R /
Step 6: To count the number of directories (including default root directory) and files,
hdfs dfs -count /
Step 7: To list the disk usage/size for each directory,
hdfs dfs -du -h /
Step 8: To display the last KB of a file,
hdfs dfs -tail /user/input/ls
Step 9: To test whether a path exists (-e), is a file (-f) or is a directory (-d); the result is returned in the exit status (see the short illustration after this procedure),
hdfs dfs -test -e /user/input/ls
hdfs dfs -test -f /user/input/ls
hdfs dfs -test -d /user/input/ls
echo $?
Step 10: To create a new file with 0 bytes,
hdfs dfs -touchz /user/new1
Step 11: To remove all the files and folders,
hdfs dfs -rm -R -skipTrash /user/*
hdfs dfs -rm -R -skipTrash /user/
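As an illustration of Step 9, the exit status of the -test option can be checked as follows, assuming /user/input was created as a directory in the earlier steps; 0 means the test is true and 1 means it is false.
hdfs dfs -test -d /user/input
echo $?    # prints 0, since /user/input is a directory
hdfs dfs -test -f /user/input
echo $?    # prints 1, since /user/input is not a regular file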
OUTPUT
RESULT
HDFS shell commands for files and folders have been performed
Exercise No: 5
HDFS ADMIN COMMANDS
Date: 24.08.2022
AIM
To perform HDFS admin commands
PROCEDURE
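Some commonly used HDFS administration commands are listed below; these are representative examples and the exact commands executed in this exercise may differ.
hdfs dfsadmin -report            # summary of cluster capacity and live/dead DataNodes
hdfs dfsadmin -safemode get      # check whether the NameNode is in safe mode
hdfs dfsadmin -safemode enter    # put the NameNode into safe mode
hdfs dfsadmin -safemode leave    # take the NameNode out of safe mode
hdfs dfsadmin -refreshNodes      # re-read the allowed/excluded DataNode lists
hdfs fsck /                      # check the health of the files stored in HDFS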
OUTPUT
RESULT
HDFS admin commands have been performed
Exercise No: 6
MAP REDUCE WORD COUNT
Date: 15.09.2022
AIM
To perform map reduce word count
PROCEDURE
Step 1: To delete the tmp directory with the name node and data node data from the local
system, change to the bin folder and run the following commands.
cd /home/hadoop/bin
./hadoop namenode -format
Step 2: To create the cluster and start the nodes, change to the sbin folder and run the following
commands.
cd /home/hadoop/sbin
./start-dfs.sh
Step 3: To start the resource manager and check the running nodes,
./start-yarn.sh
Step 4: To check the status of the running nodes, run the below command.
jps
Step 5: Class path describes the locations of the available class files to the Java
Compiler.
export HADOOP_CLASSPATH=$(hadoop classpath)
Step 6: To create a directory,
hdfs dfs -mkdir /wordcount1
Step 7: To put a file or folder,
hdfs dfs -put /home/hduser1/wordcount/input /wordcount1/input
Step 8: The program for the Mapper and Reducer is written in Java and saved in a .java
file (a sketch of such a program is given after this procedure).
Step 9: To compile java file,
javac -classpath ${HADOOP_CLASSPATH} -d <folder path> <java file path>
Step 10: Class files are created in the designated folder.
Step 11: To generate jar file,
jar -cvf <jar file name> <folder path where class files saved>
Step 12: The output files are put into a single jar file.
Step 13: To run the jar file,
hadoop jar <jar file path> <class name> <input file path> <output file path>
Step 14: To see the output,
hdfs dfs -cat <output file path>
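A minimal word count program of the kind described in Step 8 could look like the sketch below; the class and package structure are illustrative and may differ from the program actually used.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emits (word, 1) for every word in the input line
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums the counts received for each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}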
OUTPUT
RESULT
Map reduce word count has been performed
Exercise No: 7
MAP REDUCE MAX TEMPERATURE
Date: 23.09.2022
AIM
To perform map reduce max temperature
PROCEDURE
Step 1: To delete the tmp directory with the name node and data node data from the local
system, change to the bin folder and run the following commands.
cd /home/hadoop/bin
./hadoop namenode -format
Step 2: To create the cluster and start the nodes, change to the sbin folder and run the following
commands.
cd /home/hadoop/sbin
./start-dfs.sh
Step 3: To start the resource manager and check the running nodes,
./start-yarn.sh
Step 4: To check the status of the running nodes, run the below command.
jps
Step 5: Class path describes the locations of the available class files to the Java
Compiler.
export HADOOP_CLASSPATH=$(hadoop classpath)
Step 6: To create a directory,
hdfs dfs -mkdir /MaxTemp
Step 7: To create an input file,
cat > <input file>
Step 8: To put a file or folder,
hdfs dfs -put /home/hduser1/MaxTemp/input/input.txt /MaxTemp/input
Step 9: The program for the Mapper and Reducer is written in Java and saved in a .java
file (a sketch of such a program is given after this procedure).
Step 10: To compile java file,
javac -classpath ${HADOOP_CLASSPATH} -d <folder path> <java file path>
Step 11: Class files are created in the designated folder.
Step 12: To generate jar file,
jar -cvf <jar file name> <folder path where class files saved>
Step 13: The output files are put into a single jar file.
Step 14: To run the jar file,
hadoop jar <jar file path> <class name> <input file path> <output file path>
Step 15: To see the output,
hdfs dfs -cat <output file path>
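A compact max temperature program of the kind described in Step 9 could look like the sketch below. It assumes each input line holds a year and a temperature separated by whitespace; the actual input format and class names used in the lab may differ.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

    // Mapper: each input line is assumed to hold "<year> <temperature>"; emits (year, temperature)
    public static class TempMapper extends Mapper<Object, Text, Text, IntWritable> {
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] parts = value.toString().trim().split("\\s+");
            if (parts.length >= 2) {
                context.write(new Text(parts[0]), new IntWritable(Integer.parseInt(parts[1])));
            }
        }
    }

    // Reducer: keeps the maximum temperature seen for each year
    public static class MaxReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int max = Integer.MIN_VALUE;
            for (IntWritable val : values) {
                max = Math.max(max, val.get());
            }
            context.write(key, new IntWritable(max));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "max temperature");
        job.setJarByClass(MaxTemperature.class);
        job.setMapperClass(TempMapper.class);
        job.setReducerClass(MaxReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}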
OUTPUT
RESULT
Map reduce max temperature has been performed
Exercise No: 8
MONGO DB COMMANDS
Date: 20.10.2022
AIM
To perform Mongo DB commands
PROCEDURE
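Representative MongoDB shell commands for working with a database, a collection and documents are shown below; the database name, collection name and sample documents are illustrative assumptions and may differ from the commands executed in the lab.
show dbs                                                            // list the existing databases
use studentdb                                                       // switch to (or create) a database
db.createCollection("students")                                     // create a collection
db.students.insertOne({ name: "Arun", age: 22, city: "Chennai" })   // insert a document
db.students.find()                                                  // list all documents in the collection
db.students.find({ city: "Chennai" })                               // query with a condition
db.students.updateOne({ name: "Arun" }, { $set: { age: 23 } })      // update a field
db.students.deleteOne({ name: "Arun" })                             // delete a document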
OUTPUT
RESULT
Mongo DB commands have been executed
Exercise No: 9
PIG LATIN COMMANDS
Date: 25.10.2022
AIM
To perform pig latin commands
PROCEDURE
Step 1: Start the Hadoop cluster
Step 2: Start the Yarn Resource manager
Step 3: Create a folder called pigdata in the hdfs
Step 4: The pigdata folder will be created in the hdfs; view it in the browser
Step 5: Put the files in the hdfs
Step 6: View the files in the browser
Step 7: Display and view one of the files using the cat command in the terminal
Step 8: Upload all the txt files into the hdfs using the put command and view them in the
browser
Step 9: Open a new terminal and start the pig program using
pig -x local
Step 10: The grunt shell will be opened, from where we can execute all the Pig Latin
commands.
Step 11: Create a new relation called student and load the file from the hdfs system (a sample LOAD statement is given after this procedure)
Step 12: Load the student details file from the hdfs
Step 13: Use dump student command to see the file contents
Step 14: Use describe command to view the data types of the relation
Step 15: Use the explain student command to see the execution plan of the relation
Step 16: Use the illustrate command to see the schema and a sample row of data for the relation
Step 17: Group based on one of the columns
grp_rel=group student_details by city;
dump grp_rel;
Step 18: grp_rel1=group student_details by (city,age);
Step 19: Co Group: Used when there are two or more relations
cogrp_rel=cogroup student_details by age, employee by age;
Step 20: Join:
Inner join
Customer_orders=JOIN customers by id, orders by order_id;
dump Customer_orders;
Step 21: Outer join
Outer_left=JOIN customers by id LEFT OUTER, orders by customer_id;
Step 22: dump Outer_left;
Step 23: Union: Union just combines two relations with all their columns
New_rel=UNION relation1, relation2;
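For reference, the student_details relation used in Steps 17 and 18 could be loaded as in the sample below; the file path, delimiter and schema are illustrative assumptions and may differ from the file used in the lab.
student_details = LOAD '/pigdata/student_details.txt' USING PigStorage(',')
    AS (id:int, name:chararray, age:int, city:chararray);
dump student_details;
describe student_details;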
OUTPUT
RESULT
Pig Latin commands have been executed
Exercise No: 10
MAP REDUCE MATRIX MULTIPLICATION
Date: 27.10.2022
AIM
To perform map reduce matrix multiplication
PROCEDURE
Step 1: To delete the tmp directory with the name node and data node data from the local system,
change to the bin folder and run the following commands.
cd /home/hadoop/bin
./hadoop namenode -format
Step 2: To create the cluster and start the nodes, change to the sbin folder and run the following
commands.
cd /home/hadoop/sbin
./start-dfs.sh
Step 3: To start the resource manager and check the running nodes,
./start-yarn.sh
Step 4: To check the status of the running nodes, run the below command.
jps
Step 5: Class path describes the locations of the available class files to the Java
Compiler.
export HADOOP_CLASSPATH=$(hadoop classpath)
Step 6: To create a directory,
hdfs dfs -mkdir /Matrix
hdfs dfs -mkdir /Matrix/input
Step 7: To put a file or folder,
hdfs dfs -put /home/hduser1/matrix/ufiles/M.txt /Matrix/input
hdfs dfs -put /home/hduser1/matrix/ufiles/N.txt /Matrix/input
Step 8: The program for the Mapper and Reducer is written in Java and saved in a .java
file (a sketch of such a program is given after this procedure).
Step 9: To compile java file,
javac -classpath ${HADOOP_CLASSPATH} -d <folder path> <java file path>
Step 10: Class files are created in the designated folder.
Step 11: To generate jar file,
jar -cvf <jar file name> <folder path where class files saved>
Step 12: The output files are put into a single jar file.
Step 13: To run the jar file,
hadoop jar <jar file path> <class name> <input file path> <output file path>
Step 14: To see the output,
hdfs dfs -cat <output file path>
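A compact sketch of the classic one-pass map reduce matrix multiplication, of the kind described in Step 8, is given below for reference. It assumes each line of M.txt has the form "M,i,j,value" and each line of N.txt has the form "N,j,k,value", and that the matrix dimensions are passed through the job configuration; the actual program, input format and dimensions used in the lab may differ.
import java.io.IOException;
import java.util.HashMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MatrixMultiply {

    // Mapper: each M(i,j) is sent to every output cell (i,k); each N(j,k) to every cell (i,k)
    public static class MatMapper extends Mapper<Object, Text, Text, Text> {
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            Configuration conf = context.getConfiguration();
            int mRows = conf.getInt("m.rows", 2);   // number of rows of M (assumed)
            int nCols = conf.getInt("n.cols", 2);   // number of columns of N (assumed)
            String[] t = value.toString().trim().split(",");
            if (t[0].equals("M")) {
                // t = M, i, j, value
                for (int k = 0; k < nCols; k++) {
                    context.write(new Text(t[1] + "," + k), new Text("M," + t[2] + "," + t[3]));
                }
            } else {
                // t = N, j, k, value
                for (int i = 0; i < mRows; i++) {
                    context.write(new Text(i + "," + t[2]), new Text("N," + t[1] + "," + t[3]));
                }
            }
        }
    }

    // Reducer: for each cell (i,k), multiply matching j entries from M and N and sum them
    public static class MatReducer extends Reducer<Text, Text, Text, Text> {
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            HashMap<Integer, Double> mVals = new HashMap<>();
            HashMap<Integer, Double> nVals = new HashMap<>();
            for (Text v : values) {
                String[] t = v.toString().split(",");
                if (t[0].equals("M")) {
                    mVals.put(Integer.parseInt(t[1]), Double.parseDouble(t[2]));
                } else {
                    nVals.put(Integer.parseInt(t[1]), Double.parseDouble(t[2]));
                }
            }
            double sum = 0;
            for (Integer j : mVals.keySet()) {
                sum += mVals.get(j) * nVals.getOrDefault(j, 0.0);
            }
            context.write(key, new Text(Double.toString(sum)));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("m.rows", 2);   // assumed dimensions; change to match the input matrices
        conf.setInt("n.cols", 2);
        Job job = Job.getInstance(conf, "matrix multiplication");
        job.setJarByClass(MatrixMultiply.class);
        job.setMapperClass(MatMapper.class);
        job.setReducerClass(MatReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}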
OUTPUT
RESULT
Map reduce matrix multiplication has been performed