Hadoop
Installation on Ubuntu 16.04 LTS
Update
fdp17@fdp17-Veriton-M200-H81:~$ sudo apt-get update
Install JDK
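Hadoop requires a working JDK. A minimal sketch, assuming the OpenJDK 8 build that the later JAVA_HOME path (java-8-openjdk-amd64) points to:
$ sudo apt-get install openjdk-8-jdk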
Check Version
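The installed Java version can then be confirmed with:
$ java -version
$ javac -version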
Install SSH
fdp17@fdp17-Veriton-M200-H81:~$ sudo apt-get install ssh
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
ncurses-term openssh-client openssh-server openssh-sftp-server ssh-import-id
Suggested packages:
ssh-askpass libpam-ssh keychain monkeysphere rssh molly-guard
The following NEW packages will be installed:
ncurses-term openssh-server openssh-sftp-server ssh ssh-import-id
The following packages will be upgraded:
openssh-client
1 upgraded, 5 newly installed, 0 to remove and 178 not upgraded.
Need to get 1,230 kB of archives.
After this operation, 5,244 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://in.archive.ubuntu.com/ubuntu xenial-updates/main amd64 openssh-client amd64 1:7.2p2-
4ubuntu2.2 [587 kB]
Get:2 http://in.archive.ubuntu.com/ubuntu xenial-updates/main amd64 openssh-sftp-server amd64 1:7.2p2-
4ubuntu2.2 [38.7 kB]
Get:3 http://in.archive.ubuntu.com/ubuntu xenial-updates/main amd64 openssh-server amd64 1:7.2p2-
4ubuntu2.2 [338 kB]
Get:4 http://in.archive.ubuntu.com/ubuntu xenial-updates/main amd64 ssh all 1:7.2p2-4ubuntu2.2 [7,076 B]
Get:5 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 ncurses-term all 6.0+20160213-1ubuntu1 [249
kB]
Get:6 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 ssh-import-id all 5.5-0ubuntu1 [10.2 kB]
Fetched 1,230 kB in 2s (583 kB/s)
Preconfiguring packages ...
(Reading database ... 188613 files and directories currently installed.)
Preparing to unpack .../openssh-client_1%3a7.2p2-4ubuntu2.2_amd64.deb ...
Unpacking openssh-client (1:7.2p2-4ubuntu2.2) over (1:7.2p2-4ubuntu2.1) ...
Selecting previously unselected package openssh-sftp-server.
Preparing to unpack .../openssh-sftp-server_1%3a7.2p2-4ubuntu2.2_amd64.deb ...
Unpacking openssh-sftp-server (1:7.2p2-4ubuntu2.2) ...
Selecting previously unselected package openssh-server.
Preparing to unpack .../openssh-server_1%3a7.2p2-4ubuntu2.2_amd64.deb ...
Unpacking openssh-server (1:7.2p2-4ubuntu2.2) ...
Selecting previously unselected package ssh.
Preparing to unpack .../ssh_1%3a7.2p2-4ubuntu2.2_all.deb ...
Unpacking ssh (1:7.2p2-4ubuntu2.2) ...
Selecting previously unselected package ncurses-term.
Preparing to unpack .../ncurses-term_6.0+20160213-1ubuntu1_all.deb ...
Unpacking ncurses-term (6.0+20160213-1ubuntu1) ...
Selecting previously unselected package ssh-import-id.
Preparing to unpack .../ssh-import-id_5.5-0ubuntu1_all.deb ...
Unpacking ssh-import-id (5.5-0ubuntu1) ...
Processing triggers for man-db (2.7.5-1) ...
Processing triggers for ufw (0.35-0ubuntu2) ...
Processing triggers for systemd (229-4ubuntu16) ...
Processing triggers for ureadahead (0.100.0-19) ...
ureadahead will be reprofiled on next reboot
Setting up openssh-client (1:7.2p2-4ubuntu2.2) ...
Setting up openssh-sftp-server (1:7.2p2-4ubuntu2.2) ...
Setting up openssh-server (1:7.2p2-4ubuntu2.2) ...
Creating SSH2 RSA key; this may take some time ...
2048 SHA256:ENIl49vMNmyHFQMWhQ+7wfyERkQOA6XUx3TpTVzBkgk root@fdp17-Veriton-M200-
H81 (RSA)
Creating SSH2 DSA key; this may take some time ...
1024 SHA256:m8uM/6fhMPV7Ac0+4ROrlQcR36TA5tbT07/OKd7Sv3o root@fdp17-Veriton-M200-H81
(DSA)
Creating SSH2 ECDSA key; this may take some time ...
256 SHA256:x+7TNccRUWPACHLzqvB8dfQ99i7/QzGY8lkE2G1bDHM root@fdp17-Veriton-M200-H81
(ECDSA)
Creating SSH2 ED25519 key; this may take some time ...
256 SHA256:SYNVzUtPB8yy3U01cxQ7OfKZ6Wi7i5hcEpzdXEx6K5Q root@fdp17-Veriton-M200-H81
(ED25519)
Setting up ssh (1:7.2p2-4ubuntu2.2) ...
Setting up ncurses-term (6.0+20160213-1ubuntu1) ...
Setting up ssh-import-id (5.5-0ubuntu1) ...
Processing triggers for systemd (229-4ubuntu16) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for ufw (0.35-0ubuntu2) ...
fdp17@fdp17-Veriton-M200-H81:~$ su hduser
Password:
hduser@fdp17-Veriton-M200-H81:/home/fdp17$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:/xOGOuWDb/rGI1l07EQq8b2siNTQcTmpfDyYNLAPeKU hduser@fdp17-Veriton-M200-H81
The key's randomart image is:
+---[RSA 2048]----+
| ... o |
| . += = . |
| . E+ @ * |
| ..oB B = |
| oS+ B . |
| . ..+ * |
| . . X.o . |
| . B O.. |
| .Ooo.. |
+----[SHA256]-----+
Key Transfer
hduser@fdp17-Veriton-M200-H81:/home/fdp17$ cat /home/hduser/.ssh/id_rsa.pub >>
/home/hduser/.ssh/authorized_keys
The second command adds the newly created key to the list of authorized keys so that Hadoop can
use ssh without prompting for a password.
We can check if ssh works:
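A minimal check, assuming the sshd service installed above is running, is to log in to the local machine as hduser; the Ubuntu login banner below should then appear without a password prompt:
$ ssh localhost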
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Check Java
hduser@fdp17-Veriton-M200-H81:~$ update-alternatives --config java
There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-8-openjdk-
amd64/jre/bin/java
Nothing to configure
2. /usr/local/hadoop/etc/hadoop/hadoop-env.sh
We need to set JAVA_HOME by modifying the hadoop-env.sh file.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
Adding the above statement to the hadoop-env.sh file ensures that the value of the JAVA_HOME
variable is available to Hadoop whenever it starts up.
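If the JDK path on a given machine is not known, one common way to derive it is to resolve the java binary and strip the trailing jre/bin/java component (a sketch; the exact path depends on the installed JDK):
$ readlink -f /usr/bin/java | sed 's:/jre/bin/java::'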
3. /usr/local/hadoop/etc/hadoop/core-site.xml:
The /usr/local/hadoop/etc/hadoop/core-site.xml file contains configuration properties that Hadoop
uses when starting up.
This file can be used to override the default settings that Hadoop starts with.
hduser@laptop:~$ sudo mkdir -p /app/hadoop/tmp
hduser@laptop:~$ sudo chown hduser:hadoop /app/hadoop/tmp
Open the file and enter the following in between the <configuration></configuration> tag:
hduser@laptop:~$ vi /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
4. /usr/local/hadoop/etc/hadoop/mapred-site.xml
By default, the /usr/local/hadoop/etc/hadoop/ folder contains
/usr/local/hadoop/etc/hadoop/mapred-site.xml.template
file which has to be renamed/copied with the name mapred-site.xml:
hduser@laptop:~$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template
/usr/local/hadoop/etc/hadoop/mapred-site.xml
The mapred-site.xml file is used to specify which framework is being used for MapReduce.
We need to enter the following content in between the <configuration></configuration> tag:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
5. /usr/local/hadoop/etc/hadoop/hdfs-site.xml
The /usr/local/hadoop/etc/hadoop/hdfs-site.xml file needs to be configured for each host in the
cluster that is being used.
It is used to specify the directories which will be used as the namenode and the datanode on that
host.
Before editing this file, we need to create two directories which will contain the namenode and the
datanode for this Hadoop installation.
This can be done using the following commands:
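A minimal sketch, reusing the hduser:hadoop ownership applied to /app/hadoop/tmp earlier and the paths configured in hdfs-site.xml below:
hduser@laptop:~$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
hduser@laptop:~$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
hduser@laptop:~$ sudo chown -R hduser:hadoop /usr/local/hadoop_store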
Open the file and enter the following content in between the <configuration></configuration> tag:
hduser@laptop:~$ nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
Format the New Hadoop Filesystem
Now, the Hadoop file system needs to be formatted so that we can start to use it. The format
command should be issued with write permission, since it creates the current directory
under the /usr/local/hadoop_store/hdfs/namenode folder:
hduser@laptop:~$ hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
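As the warning suggests, the equivalent non-deprecated form of the command is:
$ hdfs namenode -format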
Starting Hadoop
Now it is time to start the newly configured single-node cluster using the scripts in /usr/local/hadoop/sbin:
k@laptop:/usr/local/hadoop/sbin$ ls
distribute-exclude.sh start-all.cmd stop-balancer.sh
hadoop-daemon.sh start-all.sh stop-dfs.cmd
hadoop-daemons.sh start-balancer.sh stop-dfs.sh
hdfs-config.cmd start-dfs.cmd stop-secure-dns.sh
hdfs-config.sh start-dfs.sh stop-yarn.cmd
httpfs.sh start-secure-dns.sh stop-yarn.sh
kms.sh start-yarn.cmd yarn-daemon.sh
mr-jobhistory-daemon.sh start-yarn.sh yarn-daemons.sh
refresh-namenodes.sh stop-all.cmd
slaves.sh stop-all.sh
hduser@laptop:/usr/local/hadoop/sbin$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
15/04/18 16:43:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduser-namenode-
laptop.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduser-datanode-laptop.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduser-
secondarynamenode-laptop.out
15/04/18 16:43:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduser-resourcemanager-
laptop.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduser-nodemanager-
laptop.out
We can check if it's really up and running:
hduser@laptop:/usr/local/hadoop/sbin$ jps
9026 NodeManager
7348 NameNode
9766 Jps
8887 ResourceManager
7507 DataNode
The output means that we now have a functional single-node instance of Hadoop running on our machine.
Another way to check is using netstat:
hduser@laptop:~$ netstat -plten | grep java
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 1001 1843372 10605/java
tcp 0 0 127.0.0.1:54310 0.0.0.0:* LISTEN 1001 1841277 10447/java
tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 1001 1841130 10895/java
tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 1001 1840196 10447/java
tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 1001 1841320 10605/java
tcp 0 0 0.0.0.0:50075 0.0.0.0:* LISTEN 1001 1841646 10605/java
tcp6 0 0 :::8040 :::* LISTEN 1001 1845543 11383/java
tcp6 0 0 :::8042 :::* LISTEN 1001 1845551 11383/java
tcp6 0 0 :::8088 :::* LISTEN 1001 1842110 11252/java
tcp6 0 0 :::49630 :::* LISTEN 1001 1845534 11383/java
tcp6 0 0 :::8030 :::* LISTEN 1001 1842036 11252/java
tcp6 0 0 :::8031 :::* LISTEN 1001 1842005 11252/java
tcp6 0 0 :::8032 :::* LISTEN 1001 1842100 11252/java
tcp6 0 0 :::8033 :::* LISTEN 1001 1842162 11252/java
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/
ClusterSetup.html#Hadoop_Startup
Stopping Hadoop
$ pwd
/usr/local/hadoop/sbin
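A minimal sketch using the scripts listed in the sbin directory above (stop-all.sh also works, but like start-all.sh it is deprecated in favour of the split scripts):
$ stop-dfs.sh
$ stop-yarn.sh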
To see the options for a particular example, add the example name to the examples-jar command (a sample invocation is sketched after the list). The following is a list of the available examples:
aggregatewordcount: An Aggregate-based map/reduce program that counts the words in the input
files.
aggregatewordhist: An Aggregate-based map/reduce program that computes the histogram of the
words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute the exact digits of pi.
dbcount: An example job that counts the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute the exact bits of pi.
grep: A map/reduce program that counts the matches to a regex in the input.
join: A job that effects a join over sorted, equally partitioned data sets.
sort: A map/reduce program that sorts the data written by the random writer.
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of
the words in the input files.
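A sample invocation of one of these examples, assuming the usual Hadoop 2.x layout under /usr/local/hadoop (the jar's version suffix varies by release, and input_dir/output_dir are placeholder HDFS paths):
$ cd /usr/local/hadoop
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount input_dir output_dir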
FINAL STEP 7: WordCount: Create a file called WordCount.java.
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;
public class WordCount {

  // Mapper: tokenizes each input line and emits (word, 1) for every token.
  public static class Map extends MapReduceBase implements
      Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        output.collect(word, one);
      }
    }
  }

  // Reducer (also used as the combiner): sums the counts for each word.
  public static class Reduce extends MapReduceBase implements
      Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    // Configure the job using the old mapred API.
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    // args[0] = input path in HDFS, args[1] = output path in HDFS.
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
  }
}
Compile:
$ javac WordCount.java -cp $(hadoop classpath)
The hadoop classpath command supplies the compiler with all the paths it needs to compile
correctly, and you should see the resulting WordCount*.class files appear in the directory.
Create Jar File:
jar cf wc.jar WordCount*.class
Create HDFS Directory
• /usr/local/Cellar/hadoop/input - input directory in HDFS
• /usr/local/Cellar/hadoop/output - output directory in HDFS
hdfs dfs -mkdir -p /usr/local/Cellar/hadoop/input
hdfs dfs -mkdir -p /usr/local/Cellar/hadoop/output
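If desired, the newly created directories can be confirmed with a listing:
hdfs dfs -ls /usr/local/Cellar/hadoop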
Create text file locally and move to HDFS directory
DOREENs-MacBook-Air:Cellar doreenrobin$ nano file01.txt
DOREENs-MacBook-Air:Cellar doreenrobin$ pwd
/usr/local/Cellar
DOREENs-MacBook-Air:Cellar doreenrobin$ ls
file01.txt
hadoop
DOREENs-MacBook-Air:Cellar doreenrobin$ hadoop fs -put /usr/local/Cellar/file01.txt /usr/local/Cellar/hadoop/input
17/04/03 14:32:22 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Running the MapReduce program
DOREENs-MacBook-Air:2.7.3 doreenrobin$ bin/hadoop jar wc.jar WordCount /usr/local/Cellar/hadoop/input /usr/local/Cellar/hadoop/output/file03
17/04/03 14:40:24 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
17/04/03 14:40:25 INFO Configuration.deprecation: session.id is deprecated. Instead,
use dfs.metrics.session-id
17/04/03 14:40:25 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=
17/04/03 14:40:25 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with
processName=JobTracker, sessionId= - already initialized
17/04/03 14:40:25 WARN mapreduce.JobResourceUploader: Hadoop command-line
option parsing not performed. Implement the Tool interface and execute your
application with ToolRunner to remedy this.
17/04/03 14:40:26 INFO mapred.FileInputFormat: Total input paths to process : 2
17/04/03 14:40:26 INFO mapreduce.JobSubmitter: number of splits:2
17/04/03 14:40:26 INFO mapreduce.JobSubmitter: Submitting tokens for job:
job_local428776688_0001
17/04/03 14:40:26 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/04/03 14:40:26 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/04/03 14:40:26 INFO mapreduce.Job: Running job: job_local428776688_0001
17/04/03 14:40:26 INFO mapred.LocalJobRunner: OutputCommitter is
org.apache.hadoop.mapred.FileOutputCommitter
17/04/03 14:40:26 INFO output.FileOutputCommitter: File Output Committer
Algorithm version is 1
17/04/03 14:40:26 INFO mapred.LocalJobRunner: Waiting for map tasks
17/04/03 14:40:26 INFO mapred.LocalJobRunner: Starting task:
attempt_local428776688_0001_m_000000_0
17/04/03 14:40:26 INFO output.FileOutputCommitter: File Output Committer
Algorithm version is 1
17/04/03 14:40:26 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree
currently is supported only on Linux.
17/04/03 14:40:26 INFO mapred.Task: Using ResourceCalculatorProcessTree : null
17/04/03 14:40:26 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/
usr/local/Cellar/hadoop/input/file02.txt:0+29
17/04/03 14:40:27 INFO mapred.MapTask: numReduceTasks: 1
17/04/03 14:40:27 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/03 14:40:27 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/03 14:40:27 INFO mapred.MapTask: soft limit at 83886080
17/04/03 14:40:27 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/03 14:40:27 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/03 14:40:27 INFO mapred.MapTask: Map output collector class =
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/03 14:40:27 INFO mapred.LocalJobRunner:
17/04/03 14:40:27 INFO mapred.MapTask: Starting flush of map output
17/04/03 14:40:27 INFO mapred.MapTask: Spilling map output
17/04/03 14:40:27 INFO mapred.MapTask: bufstart = 0; bufend = 44; bufvoid =
104857600
17/04/03 14:40:27 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend =
26214384(104857536); length = 13/6553600
17/04/03 14:40:27 INFO mapred.MapTask: Finished spill 0
17/04/03 14:40:27 INFO mapred.Task:
Task:attempt_local428776688_0001_m_000000_0 is done. And is in the process of
committing
17/04/03 14:40:27 INFO mapred.LocalJobRunner: hdfs://localhost:9000/usr/local/
Cellar/hadoop/input/file02.txt:0+29
17/04/03 14:40:27 INFO mapred.Task: Task
'attempt_local428776688_0001_m_000000_0' done.
17/04/03 14:40:27 INFO mapred.LocalJobRunner: Finishing task:
attempt_local428776688_0001_m_000000_0
17/04/03 14:40:27 INFO mapred.LocalJobRunner: Starting task:
attempt_local428776688_0001_m_000001_0
17/04/03 14:40:27 INFO output.FileOutputCommitter: File Output Committer
Algorithm version is 1
17/04/03 14:40:27 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree
currently is supported only on Linux.
17/04/03 14:40:27 INFO mapred.Task: Using ResourceCalculatorProcessTree : null
17/04/03 14:40:27 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/
usr/local/Cellar/hadoop/input/file01.txt:0+22
17/04/03 14:40:27 INFO mapred.MapTask: numReduceTasks: 1
17/04/03 14:40:27 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/03 14:40:27 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/03 14:40:27 INFO mapred.MapTask: soft limit at 83886080
17/04/03 14:40:27 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/03 14:40:27 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/03 14:40:27 INFO mapred.MapTask: Map output collector class =
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/03 14:40:27 INFO mapred.LocalJobRunner:
17/04/03 14:40:27 INFO mapred.MapTask: Starting flush of map output
17/04/03 14:40:27 INFO mapred.MapTask: Spilling map output
17/04/03 14:40:27 INFO mapred.MapTask: bufstart = 0; bufend = 38; bufvoid =
104857600
17/04/03 14:40:27 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend =
26214384(104857536); length = 13/6553600
17/04/03 14:40:27 INFO mapred.MapTask: Finished spill 0
17/04/03 14:40:27 INFO mapred.Task:
Task:attempt_local428776688_0001_m_000001_0 is done. And is in the process of
committing
17/04/03 14:40:27 INFO mapred.LocalJobRunner: hdfs://localhost:9000/usr/local/
Cellar/hadoop/input/file01.txt:0+22
17/04/03 14:40:27 INFO mapred.Task: Task
'attempt_local428776688_0001_m_000001_0' done.
17/04/03 14:40:27 INFO mapred.LocalJobRunner: Finishing task:
attempt_local428776688_0001_m_000001_0
17/04/03 14:40:27 INFO mapred.LocalJobRunner: map task executor complete.
17/04/03 14:40:27 INFO mapred.LocalJobRunner: Waiting for reduce tasks
17/04/03 14:40:27 INFO mapred.LocalJobRunner: Starting task:
attempt_local428776688_0001_r_000000_0
17/04/03 14:40:27 INFO output.FileOutputCommitter: File Output Committer
Algorithm version is 1
17/04/03 14:40:27 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree
currently is supported only on Linux.
17/04/03 14:40:27 INFO mapred.Task: Using ResourceCalculatorProcessTree : null
17/04/03 14:40:27 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin:
org.apache.hadoop.mapreduce.task.reduce.Shuffle@7e5ccf1d
17/04/03 14:40:27 INFO reduce.MergeManagerImpl: MergerManager:
memoryLimit=334338464, maxSingleShuffleLimit=83584616,
mergeThreshold=220663392, ioSortFactor=10,
memToMemMergeOutputsThreshold=10
17/04/03 14:40:27 INFO reduce.EventFetcher:
attempt_local428776688_0001_r_000000_0 Thread started: EventFetcher for
fetching Map Completion Events
17/04/03 14:40:27 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output
of map attempt_local428776688_0001_m_000000_0 decomp: 41 len: 45 to
MEMORY
17/04/03 14:40:27 INFO reduce.InMemoryMapOutput: Read 41 bytes from map-
output for attempt_local428776688_0001_m_000000_0
17/04/03 14:40:27 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-
output of size: 41, inMemoryMapOutputs.size() -> 1, commitMemory -> 0,
usedMemory ->41
17/04/03 14:40:27 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output
of map attempt_local428776688_0001_m_000001_0 decomp: 36 len: 40 to
MEMORY
17/04/03 14:40:27 INFO reduce.InMemoryMapOutput: Read 36 bytes from map-
output for attempt_local428776688_0001_m_000001_0
17/04/03 14:40:27 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-
output of size: 36, inMemoryMapOutputs.size() -> 2, commitMemory -> 41,
usedMemory ->77
17/04/03 14:40:27 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
17/04/03 14:40:27 INFO mapred.LocalJobRunner: 2 / 2 copied.
17/04/03 14:40:27 INFO reduce.MergeManagerImpl: finalMerge called with 2 in-
memory map-outputs and 0 on-disk map-outputs
17/04/03 14:40:27 INFO mapred.Merger: Merging 2 sorted segments
17/04/03 14:40:27 INFO mapred.Merger: Down to the last merge-pass, with 2
segments left of total size: 61 bytes
17/04/03 14:40:27 INFO reduce.MergeManagerImpl: Merged 2 segments, 77 bytes
to disk to satisfy reduce memory limit
17/04/03 14:40:27 INFO reduce.MergeManagerImpl: Merging 1 files, 79 bytes from
disk
17/04/03 14:40:27 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes
from memory into reduce
17/04/03 14:40:27 INFO mapred.Merger: Merging 1 sorted segments
17/04/03 14:40:27 INFO mapred.Merger: Down to the last merge-pass, with 1
segments left of total size: 69 bytes
17/04/03 14:40:27 INFO mapred.LocalJobRunner: 2 / 2 copied.
17/04/03 14:40:27 INFO mapreduce.Job: Job job_local428776688_0001 running in
uber mode : false
17/04/03 14:40:27 INFO mapreduce.Job: map 100% reduce 0%
17/04/03 14:40:27 INFO mapred.Task:
Task:attempt_local428776688_0001_r_000000_0 is done. And is in the process of
committing
17/04/03 14:40:27 INFO mapred.LocalJobRunner: 2 / 2 copied.
17/04/03 14:40:27 INFO mapred.Task: Task
attempt_local428776688_0001_r_000000_0 is allowed to commit now
17/04/03 14:40:27 INFO
output.FileOutputCommitter: Saved output of task
'attempt_local428776688_0001_r_000000_0' to hdfs://localhost:9000/usr/local/
Cellar/hadoop/output/file03/_temporary/0/task_local428776688_0001_r_000000
17/04/03 14:40:27 INFO mapred.LocalJobRunner: reduce > reduce
17/04/03 14:40:27 INFO mapred.Task: Task
'attempt_local428776688_0001_r_000000_0' done.
17/04/03 14:40:27 INFO mapred.LocalJobRunner: Finishing task:
attempt_local428776688_0001_r_000000_0
17/04/03 14:40:27 INFO mapred.LocalJobRunner: reduce task executor complete.
17/04/03 14:40:28 INFO mapreduce.Job: map 100% reduce 100%
17/04/03 14:40:28 INFO mapreduce.Job: Job job_local428776688_0001 completed
successfully
17/04/03 14:40:28 INFO mapreduce.Job: Counters: 35
Map-Reduce Framework
Spilled Records=12
Shuffled Maps =2
Failed Shuffles=0
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=51
File Output Format Counters
Bytes Written=41
Web URL
http://localhost:50070/explorer.html#/usr/local/Cellar/hadoop/output/file03
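The same result can be read from the command line; with the old mapred API the reducer output normally lands in a part-00000 file (the file name here is an assumption, not part of the transcript):
hdfs dfs -cat /usr/local/Cellar/hadoop/output/file03/part-00000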
UBUNTU: COPYING FILES TO HADOOP
The following command is used to create an input directory in HDFS.
$HADOOP_HOME/bin/hadoop fs -mkdir input_dir
Step 5
The following command is used to copy the input file named sample.txt into the input directory of HDFS.
$HADOOP_HOME/bin/hadoop fs -put /home/hadoop/sample.txt input_dir
Step 6
The following command is used to verify the files in the input directory.
$HADOOP_HOME/bin/hadoop fs -ls input_dir/
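To confirm the upload, the file contents can also be printed directly from HDFS (a quick check using the same placeholder paths):
$HADOOP_HOME/bin/hadoop fs -cat input_dir/sample.txt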