Installation of Hadoop 3
Step 1: Installation of OpenJDK 8
$ sudo apt install openjdk-8-jdk openjdk-8-jre
$ java -version
$ sudo apt install vim openssh-server openssh-client
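Because a machine can have several JDKs installed, it may help to confirm that the `java` on the PATH really is version 8. The sketch below parses the output of `java -version`; the helper name `java_major` is hypothetical and only for illustration.

```shell
# java_major: read `java -version` output on stdin and print the major version.
# Handles the legacy "1.8.0_292" scheme (prints 8) and the newer "11.0.2" scheme (prints 11).
java_major() {
  # Pull the quoted version string out of the first matching line.
  v=$(sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$v" in
    1.*) v=${v#1.}; echo "${v%%.*}" ;;   # 1.8.0_292 -> 8
    *)   echo "${v%%[.-]*}" ;;           # 11.0.2 -> 11
  esac
}
```

Note that `java -version` writes to standard error, so a typical call would be `java -version 2>&1 | java_major`.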
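Step 2: Adding the JDK path to the PATH variable
Open ~/.bashrc in an editor (sudo is not needed for a file in your own home directory)
$ vim ~/.bashrc
# go to the last line and add the following
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin
## save and exit
Reload the file, then check both variables
$ source ~/.bashrc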
$ echo $JAVA_HOME
$ echo $PATH
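Reading a long PATH by eye is error-prone. As a rough sketch, the check below tests whether the JDK's `bin` directory (where the `java` and `jps` executables live) is one of the PATH entries. The function name `path_has` is an assumption made for this example.

```shell
# path_has DIR PATHSTRING: succeed if DIR is one of the colon-separated entries.
path_has() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report whether the JDK bin directory is reachable through PATH.
if path_has "${JAVA_HOME:-}/bin" "$PATH"; then
  echo "JAVA_HOME/bin is on PATH"
else
  echo "JAVA_HOME/bin is missing from PATH"
fi
```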
(Just in case the hadoop user lacks sudo rights, add it to the sudoers file)
$ sudo visudo
# User privilege specification
root ALL=(ALL:ALL) ALL
hadoop ALL=(ALL:ALL) ALL
(To save and exit, if visudo opened the nano editor: Ctrl+X, Y, Enter)
Step 4: Once the user is added, log in as the user "hadoop" and generate an SSH key for
passwordless login (the prompt should read hadoop@machinename)
$ sudo su - hadoop
$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
Check that SSH login to localhost works without asking for a password
$ ssh localhost
IMPORTANT
Once the connection is made, log out of the SSH session
$ exit
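If `ssh localhost` still asks for a password, overly open permissions are a common cause, since sshd's StrictModes setting rejects key files that other users can write to. A minimal sketch of tightening them follows; `secure_ssh_dir` is a hypothetical helper, and the optional argument exists only so it can be tried on a scratch directory first.

```shell
# secure_ssh_dir [DIR]: restrict an SSH directory and its authorized_keys file
# to the owner only. DIR defaults to ~/.ssh.
secure_ssh_dir() {
  ssh_dir="${1:-$HOME/.ssh}"
  mkdir -p "$ssh_dir"
  chmod 700 "$ssh_dir"                    # owner-only access to the directory
  touch "$ssh_dir/authorized_keys"
  chmod 600 "$ssh_dir/authorized_keys"    # owner read/write only
}
```

Run as the hadoop user, `secure_ssh_dir` with no argument applies the permissions to ~/.ssh.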
Step 7c: Modify hdfs-site.xml to set up the namenode and datanode paths and the replication factor
Create a folder for namenode and datanode usage
$ ls
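The directory-creation commands themselves are not shown in this step. As a sketch only, assuming the layout implied by the chown commands and the hdfs-site.xml paths in this guide (hdfs/namenode, hdfs/datanode, and htemp under /usr/local/hadoop), the folders could be created like this. `make_hadoop_dirs` is a hypothetical helper name.

```shell
# make_hadoop_dirs [BASE]: create the NameNode, DataNode, and temp directories.
# BASE defaults to /usr/local/hadoop, the location assumed from this guide.
make_hadoop_dirs() {
  base="${1:-/usr/local/hadoop}"
  mkdir -p "$base/hdfs/namenode" "$base/hdfs/datanode" "$base/htemp"
}
```

On a real install /usr/local is owned by root, so the equivalent one-liner would be `sudo mkdir -p /usr/local/hadoop/hdfs/namenode /usr/local/hadoop/hdfs/datanode /usr/local/hadoop/htemp`.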
Give the hadoop user ownership of the hdfs and htemp folders
$ sudo chown -R hadoop:hadoop /usr/local/hadoop/hdfs
$ sudo chown -R hadoop:hadoop /usr/local/hadoop/htemp
Edit hdfs-site.xml so that it contains the following
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/hdfs/datanode</value>
</property>
</configuration>
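A typo in a property name fails silently, because Hadoop just falls back to its default. As a quick sanity check, the sketch below reads a value back out of the file. It is a simple line-oriented approach that assumes one tag per line, as in the listing above, and `get_prop` is a hypothetical helper name.

```shell
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# Not a real XML parser; assumes <name> and <value> sit on separate lines.
get_prop() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
      print; exit
    }
  ' "$1"
}
```

For example, `get_prop hdfs-site.xml dfs.replication` should print the replication factor set in the file. Once the Hadoop binaries are on the PATH, `hdfs getconf -confKey dfs.replication` reports the value Hadoop actually resolves.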
Step 7e: Configure YARN by editing yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
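The env-whitelist names the environment variables that YARN containers may inherit from the NodeManager, so a variable that is not exported in the NodeManager's environment has nothing to pass on. The following sketch lists which names from a comma-separated list are not exported in the current shell. `missing_env` is a hypothetical helper for illustration.

```shell
# missing_env LIST: print each name in a comma-separated list that is not exported.
missing_env() {
  echo "$1" | tr ',' '\n' | while read -r name; do
    [ -n "$name" ] || continue
    printenv "$name" >/dev/null || echo "$name"
  done
}

# Check a few of the whitelisted names in the current shell.
missing_env "JAVA_HOME,HADOOP_CONF_DIR,HADOOP_YARN_HOME"
```

An empty result means every listed variable is exported. Some of them are normally set by Hadoop's own scripts, so a name reported here is a hint to investigate rather than proof of a problem.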
Running jps as the hadoop user should list the following daemons (the process IDs will differ)
12293 Jps
9877 NameNode
10085 DataNode
10953 NodeManager
10590 ResourceManager
10335 SecondaryNameNode
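The listing above has the shape of `jps` output, with one Java process per line. A small sketch for checking it automatically follows. The helper name `missing_daemons` is hypothetical, and the list of five expected daemons is taken from the output shown here.

```shell
# missing_daemons: read `jps` output on stdin and print each expected daemon that is absent.
# Usage: jps | missing_daemons
missing_daemons() {
  out="$(cat)"
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the whole second field, so NameNode does not match SecondaryNameNode.
    echo "$out" | awk -v d="$d" '$2 == d { ok = 1 } END { exit !ok }' || echo "$d"
  done
}
```

No output means all five daemons are running. Any name printed points to a daemon whose log file is worth inspecting.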
Step 9: Access the Hadoop management web portal by entering the following address
in the browser
http://localhost:9870