
MBL21304L – Big Data with Hadoop

Record Work

Register Number : RA2152007010018

Name of the Student : Raghuram Krishna B S

Semester/Year : III Semester/II Year

Department : MBA

Specialization : Business Analytics


SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
S.R.M. NAGAR, KATTANKULATHUR -603 203

BONAFIDE CERTIFICATE

Register No. RA2152007010018

Certified to be the bonafide record of work done by Raghuram Krishna B S of second
year, MBA Degree course, in the practical course MBL21304L – Big Data with Hadoop at
SRM Institute of Science and Technology, Kattankulathur, during the academic year
2022-2023 (Odd Semester).

Signature of the Faculty Signature of the Dean/CoM

Submitted for the University Examination held on ____________ at SRM Institute of Science
and Technology, Kattankulathur.

Examiner-1 Examiner-2
List of Experiments

Ex. No   Date         Name of the Exercise                              Page No
1        20.07.2022   Installation of Hadoop with Ubuntu                1
2        20.07.2022   Basic commands to work with Ubuntu                5
3        02.08.2022   Basic HDFS commands                               8
4        10.08.2022   HDFS shell commands – file folder commands        11
5        24.08.2022   HDFS admin commands                               16
6        15.09.2022   Map reduce word count                             18
7        23.09.2022   Map reduce max temperature                        22
8        20.10.2022   Mongo DB commands                                 28
9        25.10.2022   Pig Latin commands                                32
10       27.10.2022   Map reduce matrix multiplication                  40
Exercise No: 1
INSTALLATION OF HADOOP WITH UBUNTU
Date: 20.07.2022

AIM
To install Hadoop on Ubuntu in the system

PROCEDURE

Step 1: Install the Oracle VM VirtualBox 6.1.26 setup.
Step 2: Start the installation procedure: open a terminal in Linux and check the installed
version.
Step 3: Create a Hadoop group and a new user.
Step 4: Install and run the SSH server.
Step 5: SSH key generation. Create an SSH key and add it to the authorized keys.
Step 6: Hadoop installation as hduser1. Download and unpack Apache Hadoop.
Step 7: Hadoop configuration.
Step 8: Format the file system.
Step 9: Start Hadoop.
Step 10: Check if everything is running (a sketch of the commands for these steps is given below).
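
A minimal sketch of the commands behind these steps, assuming the user hduser1, the group
hadoop, and a Hadoop 3.3.1 archive unpacked into a folder named hadoop (the names and the
version are only illustrative):

    java -version                                            # Step 2: check the installed Java version
    sudo addgroup hadoop                                      # Step 3: create the Hadoop group
    sudo adduser --ingroup hadoop hduser1                     # Step 3: create the new user
    sudo apt-get install openssh-server                       # Step 4: install the SSH server
    ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa                  # Step 5: generate an SSH key
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys           # Step 5: add it to the authorized keys
    tar -xzf hadoop-3.3.1.tar.gz && mv hadoop-3.3.1 hadoop    # Step 6: unpack Apache Hadoop
    cd hadoop/bin && ./hadoop namenode -format                # Step 8: format the file system
    cd ../sbin && ./start-dfs.sh && ./start-yarn.sh           # Step 9: start HDFS and YARN
    jps                                                       # Step 10: check that everything is running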

OUTPUT

RESULT
The installation of Hadoop on Ubuntu has been performed

Exercise No: 2
BASIC COMMANDS TO WORK WITH UBUNTU
Date: 20.07.2022

AIM
To work with basic commands in Ubuntu

PROCEDURE

Step 1: Give the following commands to work with some basic functions (example invocations are shown after this list).
Step 2: pwd - This command refers to the present working directory in which you are
operating
Step 3: dir - This command is used to print all the available directories in the present
working directory
Step 4: ls – This command is used to list all the directories and files inside the
present working directory.
Step 5: cd - This command is used to change the current directory in the terminal
Step 6: touch -This command is used to create a new file
Step 7: mkdir - This command will make a directory in pwd
Step 8: rmdir - This command will remove the directory
Step 9: ping - Use the ping command to check connectivity to a server
Step 10: hostname - Displays the hostname
Step 11: uname – Use this command to get the release number, version of Linux and
much more
Step 12: hadoop - Running hadoop with no arguments displays its usage information
Step 13: hadoop version - Displays the installed Hadoop version
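
A short example session for these commands, assuming a test directory named demo (the name
is only illustrative):

    pwd                      # show the present working directory
    mkdir demo               # create a directory
    touch demo/file1.txt     # create an empty file inside it
    ls demo                  # list its contents
    rm demo/file1.txt        # remove the file so the directory is empty
    rmdir demo               # remove the now-empty directory
    ping -c 2 localhost      # check connectivity with two packets
    hostname                 # display the hostname
    uname -a                 # kernel name, release and version
    hadoop version           # installed Hadoop version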

OUTPUT

RESULT

Basic commands in Ubuntu have been performed

Exercise No: 3
BASIC HDFS COMMANDS
Date: 02.08.2022

AIM
To work with basic HDFS commands

PROCEDURE

Step 1: To format the NameNode in Hadoop, give the following commands
cd hadoop
cd bin
./hadoop namenode -format
Step 2: After navigating into the bin folder of Hadoop, the NameNode formatting is done
Step 3: To create the cluster and start the nodes, change to the sbin folder and run the
following commands.
cd ..
cd sbin
start-dfs.sh
Step 4: To start the resource manager and check the running nodes, input the following
commands
start-yarn.sh
jps
Step 5: To check the status of the running nodes, run the below command.
jps
Step 6: To create a directory, input the following command
hdfs dfs -mkdir -p /user/

OUTPUT

RESULT

Basic HDFS commands have been performed and a directory has been created

Exercise No: 4
HDFS SHELL COMMANDS – FILE FOLDER COMMANDS
Date: 10.08.2022

AIM
To perform HDFS shell commands

PROCEDURE
Step 1: To copy a file from local FS to HDFS, run the following code
hdfs dfs -copyFromLocal /home/hduser1/ls /user/input
Step 2: To move a file from local FS to HDFS, run the following code
hdfs dfs -moveFromLocal /home/hduser1/This /user/input
Step 3: To copy a file from HDFS FS to local FS,
hdfs dfs -copyToLocal /user/input/This /home/hduser1/Desktop/
Step 4: To display the content of a file,
hdfs dfs -cat /user/input/This
Step 5: To list the directory,
hdfs dfs -ls -R /
Step 6: To count the number of directories (including default root directory) and files,
hdfs dfs -count /
Step 7: To list the disk usage/size for each directory,
hdfs dfs -du -h /
Step 8: To display the last KB of a file,
hdfs dfs -tail /user/input/ls
Step 9: To test if path, file and directory exists,
hdfs dfs -test -e /user/input/ls
hdfs dfs -test -f /user/input/ls
hdfs dfs -test -d /user/input/ls
echo $?
Step 10: To create a new file with 0 bytes,
hdfs dfs -touchz /user/new1
Step 11: To remove all the files and folders,
hdfs dfs -rm -R -skipTrash /user/*
hdfs dfs -rm -R -skipTrash /user/

OUTPUT

RESULT

File and folder HDFS commands have been performed and verified

Exercise No: 5
HDFS ADMIN COMMANDS
Date: 24.08.2022

AIM
To perform HDFS admin commands

PROCEDURE

Step 1: du - displays the disk usage of each HDFS directory and file (the -h option prints human-readable sizes)
hdfs dfs -du -h /
Step 2: checksum - returns the checksum information of a file; run the following code
hdfs dfs -checksum /user/file1.txt
Step 3: chown - changes the owner (and optionally the group) of a file
hdfs dfs -chown hduser1:usr /user/file1.txt
Step 4: appendToFile - appends the content of a local file to a file in HDFS
hdfs dfs -appendToFile /home/hduser1/Desktop/This /user/lyric2.txt
Step 5: expunge- deleting trash in hdfs
hdfs dfs -expunge
Step 6: chgrp - changes the group of a file
hdfs dfs -chgrp usr2 /user/hduser1/lyric2
Step 7: chmod - changes the user permissions of a file
hdfs dfs -chmod -R <mode> /lyric

OUTPUT

RESULT

HDFS admin commands have been performed

Exercise No: 6
MAP REDUCE WORD COUNT
Date: 15.09.2022

AIM
To perform map reduce word count

PROCEDURE

Step 1: To clear the existing name node and data node data under the local tmp directory,
change to the bin folder and format the NameNode with the following commands.
cd /home/hadoop/bin
./hadoop namenode -format
Step 2: To create the cluster and start the nodes, change to the sbin folder and run the
following commands.
cd /home/hadoop/sbin
./start-dfs.sh
Step 3: To start the resource manager and check the running nodes,
./start-yarn.sh
Step 4: To check the status of the running nodes, run the below command.
jps
Step 5: Class path describes the locations of the available class files to the Java
Compiler.
export HADOOP_CLASSPATH=$(hadoop classpath)
Step 6: To create a directory,
hdfs dfs -mkdir /wordcount1
Step 7: To put a file or folder,
hdfs dfs -put /home/hduser1/wordcount/input /wordcount1/input
Step 8: The program for the Mapper and Reducer is written in Java and saved in a .java
file (a minimal sketch is given after this procedure).
Step 9: To compile the Java file,
javac -classpath ${HADOOP_CLASSPATH} -d <folder path> <java file path>
Step 10: Class files are created in the designated folder.

Step 11: To generate the jar file,
jar -cvf <jar file name> <folder path where the class files are saved>
Step 12: The output files are put into a single jar file.
Step 13: To run the jar file,
hadoop jar <jar file path> <class name> <input file path> <output file path>
Step 14: To see the output,
hdfs dfs -cat <output file path>
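
The following is a minimal sketch of the Mapper and Reducer mentioned in Step 8. It follows
the standard Hadoop word count pattern; the class name, package-free layout and local folder
names used below are assumptions and should be adjusted to match your own files.

    // WordCount.java
    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Mapper: emit (word, 1) for every token in each input line
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, one);
                }
            }
        }

        // Reducer: sum the counts received for each word
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // input file path
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // output file path
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

With the placeholders of Steps 9 to 14 filled in (the local folder and jar names below are
assumptions), the job can be compiled and run, for example, as:

    javac -classpath ${HADOOP_CLASSPATH} -d /home/hduser1/wordcount/classes WordCount.java
    jar -cvf wordcount.jar -C /home/hduser1/wordcount/classes .
    hadoop jar wordcount.jar WordCount /wordcount1/input /wordcount1/output
    hdfs dfs -cat /wordcount1/output/part-r-00000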

OUTPUT

RESULT

The MapReduce word count has been performed

Exercise No: 7
MAP REDUCE MAX TEMPERATURE
Date: 23.09.2022

AIM
To perform map reduce max temperature

PROCEDURE

Step 1: To clear the existing name node and data node data under the local tmp directory,
change to the bin folder and format the NameNode with the following commands.
cd /home/hadoop/bin
./hadoop namenode -format
Step 2: To create the cluster and start the nodes, change to the sbin folder and run the
following commands.
cd /home/hadoop/sbin
./start-dfs.sh
Step 3: To start the resource manager and check the running nodes,
./start-yarn.sh
Step 4: To check the status of the running nodes, run the below command.
jps
Step 5: Class path describes the locations of the available class files to the Java
Compiler.
export HADOOP_CLASSPATH=$(hadoop classpath)
Step 6: To create a directory,
hdfs dfs -mkdir /MaxTemp
Step 7: To create an input file,
cat > <input file>
Step 8: To put a file or folder,
hdfs dfs -put /home/hduser1/MaxTemp/input/input.txt /MaxTemp/input
Step 9: The program for the Mapper and Reducer is written in Java and saved in a .java
file (a minimal sketch is given after this procedure).
Step 10: To compile the Java file,
javac -classpath ${HADOOP_CLASSPATH} -d <folder path> <java file path>
Step 11: Class files are created in the designated folder.
Step 12: To generate the jar file,
jar -cvf <jar file name> <folder path where the class files are saved>
Step 13: The output files are put into a single jar file.
Step 14: To run the jar file,
hadoop jar <jar file path> <class name> <input file path> <output file path>
Step 15: To see the output,
hdfs dfs -cat <output file path>
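
A minimal sketch of the Mapper and Reducer mentioned in Step 9, assuming each line of the
input file holds a year and a temperature separated by whitespace (for example "1950 31");
the class name and the parsing are assumptions and should be adjusted to the actual input
format used in the record.

    // MaxTemperature.java
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MaxTemperature {

        // Mapper: emit (year, temperature) for each record
        public static class TempMapper extends Mapper<Object, Text, Text, IntWritable> {
            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] parts = value.toString().trim().split("\\s+");
                if (parts.length >= 2) {
                    context.write(new Text(parts[0]), new IntWritable(Integer.parseInt(parts[1])));
                }
            }
        }

        // Reducer: keep the maximum temperature seen for each year
        public static class MaxReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int max = Integer.MIN_VALUE;
                for (IntWritable val : values) {
                    max = Math.max(max, val.get());
                }
                context.write(key, new IntWritable(max));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "max temperature");
            job.setJarByClass(MaxTemperature.class);
            job.setMapperClass(TempMapper.class);
            job.setReducerClass(MaxReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. /MaxTemp/input
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g. /MaxTemp/output
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }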

OUTPUT

RESULT

The MapReduce max temperature job has been performed

Exercise No: 8
MONGO DB COMMANDS
Date: 20.10.2022

AIM
To perform Mongo DB commands

PROCEDURE

Step 1: Open MongoDB in the command prompt.
Step 2: Make a database named mycustomers.
Step 3: Create a collection named customers. INSERT: Use the insert command to insert data into the collection customers.
Step 4: Insert multiple values into the collection.
Step 5: UPDATE: This command is used to update a value in the collection.
Step 6: REMOVE: Removes a value from the collection (example commands are shown after this list).
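
A short sketch of the commands behind these steps in the mongo shell; the document fields
and values are only illustrative.

    use mycustomers
    db.createCollection("customers")
    db.customers.insert({ name: "Ravi", city: "Chennai" })
    db.customers.insert([
        { name: "Priya", city: "Madurai" },
        { name: "Arun", city: "Salem" }
    ])
    db.customers.find()
    db.customers.update({ name: "Ravi" }, { $set: { city: "Coimbatore" } })
    db.customers.remove({ name: "Arun" })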

OUTPUT

RESULT

MongoDB commands have been performed

Exercise No: 9
PIG LATIN COMMANDS
Date: 25.10.2022

AIM
To perform pig latin commands

PROCEDURE
Step 1: Start the Hadoop cluster
Step 2: Start the Yarn Resource manager
Step 3: Create a folder called pigdata in the hdfs
Step 4: The pigdata folder will be created in the hdfs view it in the browser
Step 5: Put the files in the hdfs
Step 6: View the files in the browser
Step 7: Display and view one of the files using the cat command in the terminal
Step 8: Upload all the txt files into HDFS using the put command and view them in the
browser
Step 9: Open a new terminal and start the pig program using
pig -x local
Step 10: The Grunt shell will be opened, from where we can execute all the Pig Latin
commands.
Step 11: Create a new relation called student and load the file from HDFS (a sample LOAD statement is shown after this procedure)
Step 12: Load the student details file from HDFS into a relation called student_details
Step 13: Use the dump student command to see the file contents
Step 14: Use the describe command to view the schema and data types of the relation
Step 15: Use the explain student command to see the logical, physical and MapReduce execution plans of the relation
Step 16: Use the illustrate command to see the data types along with a sample row of the relation
Step 17: Group based on one of the columns
grp_rel=group student_details by city;
dump grp_rel;
Step 18: grp_rel1=group student_details by (city,age);
Step 19: COGROUP: used when there are two or more relations
cogrp_rel = cogroup student_details by age, employee by age;
Step 20: Join:

32
Inner join
customer_orders = JOIN customers by id, orders by order_id;
dump customer_orders;
Step 21: Outer join
outer_left = JOIN customers by id LEFT OUTER, orders by customer_id;
Step 22: dump outer_left;
Union: UNION combines two relations with all their columns
new_rel = UNION relation1, relation2;
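
A sample LOAD statement for Steps 11 to 13; the file name, delimiter and schema are
assumptions, and in local mode (pig -x local) the path refers to the local file system
rather than HDFS.

    student_details = LOAD '/pigdata/student_details.txt' USING PigStorage(',')
            AS (id:int, name:chararray, age:int, city:chararray);
    dump student_details;
    describe student_details;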

OUTPUT

RESULT
Pig Latin commands have been executed

Exercise No: 10
MAP REDUCE MATRIX MULTIPLICATION
Date: 27.10.2022

AIM
To perform map reduce matrix multiplication

PROCEDURE

Step 1: To clear the existing name node and data node data under the local tmp directory,
change to the bin folder and format the NameNode with the following commands.
cd /home/hadoop/bin
./hadoop namenode -format
Step 2: To create the cluster and start the nodes, change to the sbin folder and run the
following commands.
cd /home/hadoop/sbin
./start-dfs.sh
Step 3: To start the resource manager and check the running nodes,
./start-yarn.sh
Step 4: To check the status of the running nodes, run the below command.
jps
Step 5: Class path describes the locations of the available class files to the Java
Compiler.
export HADOOP_CLASSPATH=$(hadoop classpath)
Step 6: To create a directory,
hdfs dfs -mkdir /Matrix
hdfs dfs -mkdir /Matrix/input
Step 7: To put a file or folder,
hdfs dfs -put /home/hduser1/matrix/ufiles/M.txt /Matrix/input
hdfs dfs -put /home/hduser1/matrix/ufiles/N.txt /Matrix/input
Step 8: The program for the Mapper and Reducer is written in Java and saved in a .java
file (a minimal sketch is given after this procedure).
Step 9: To compile the Java file,
javac -classpath ${HADOOP_CLASSPATH} -d <folder path> <java file path>
Step 10: Class files are created in the designated folder.
Step 11: To generate the jar file,
jar -cvf <jar file name> <folder path where the class files are saved>
Step 12: The output files are put into a single jar file.
Step 13: To run the jar file,
hadoop jar <jar file path> <class name> <input file path> <output file path>
Step 14: To see the output,
hdfs dfs -cat <output file path>
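
A minimal sketch of the Mapper and Reducer mentioned in Step 8, assuming each line of M.txt
and N.txt has the form "M,i,j,value" or "N,j,k,value" and that the matrix dimensions are
passed through the job configuration; the class name, input format and dimensions are
assumptions and should be adjusted to the actual files.

    // MatrixMultiply.java
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MatrixMultiply {

        // Mapper: replicate each element of M across the columns of the result,
        // and each element of N across the rows of the result, keyed by the result cell (i,k).
        public static class MatrixMapper extends Mapper<Object, Text, Text, Text> {
            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                Configuration conf = context.getConfiguration();
                int mRows = conf.getInt("m.rows", 2);   // number of rows of M
                int nCols = conf.getInt("n.cols", 2);   // number of columns of N
                String[] t = value.toString().split(",");
                if (t[0].equals("M")) {                 // element M[i][j] = t[3]
                    for (int k = 0; k < nCols; k++) {
                        context.write(new Text(t[1] + "," + k), new Text("M," + t[2] + "," + t[3]));
                    }
                } else {                                // element N[j][k] = t[3]
                    for (int i = 0; i < mRows; i++) {
                        context.write(new Text(i + "," + t[2]), new Text("N," + t[1] + "," + t[3]));
                    }
                }
            }
        }

        // Reducer: for each result cell (i,k), multiply matching M[i][j] and N[j][k] and sum over j.
        public static class MatrixReducer extends Reducer<Text, Text, Text, Text> {
            public void reduce(Text key, Iterable<Text> values, Context context)
                    throws IOException, InterruptedException {
                Map<Integer, Double> mRow = new HashMap<>();
                Map<Integer, Double> nCol = new HashMap<>();
                for (Text val : values) {
                    String[] t = val.toString().split(",");
                    if (t[0].equals("M")) {
                        mRow.put(Integer.parseInt(t[1]), Double.parseDouble(t[2]));
                    } else {
                        nCol.put(Integer.parseInt(t[1]), Double.parseDouble(t[2]));
                    }
                }
                double sum = 0;
                for (Map.Entry<Integer, Double> e : mRow.entrySet()) {
                    sum += e.getValue() * nCol.getOrDefault(e.getKey(), 0.0);
                }
                context.write(key, new Text(Double.toString(sum)));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.setInt("m.rows", 2);   // assumed dimensions of the sample matrices
            conf.setInt("n.cols", 2);
            Job job = Job.getInstance(conf, "matrix multiplication");
            job.setJarByClass(MatrixMultiply.class);
            job.setMapperClass(MatrixMapper.class);
            job.setReducerClass(MatrixReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. /Matrix/input
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g. /Matrix/output
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }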

OUTPUT

RESULT
The MapReduce matrix multiplication has been performed

