How To Run Wordcount Program in EC2

The document logs a user session on an EC2 instance that builds and runs a Hadoop WordCount job against HDFS. The key events are: 1) Directories such as /user/ramg and /user/ramg/wc are created in HDFS. 2) Permission errors occur when ec2-user and root try to create directories under /user, which is owned by the hdfs user. 3) The hdfs superuser creates the missing home directories and adjusts their ownership and permissions so other users can write to them. 4) ec2-user's first attempt to compile WordCount.java fails because the classpath points at /usr/lib/hadoop, which does not exist on this parcel-based Cloudera install; recompiling against /opt/cloudera/parcels/CDH/lib succeeds. 5) The classes are packaged into wordcount.jar, the input files are uploaded to HDFS, and the MapReduce job completes successfully, writing the word counts to /user/ramg/wc/output.


login as: ec2-user

Authenticating with public key "imported-openssh-key"


Last login: Sat Feb 16 15:18:20 2019 from 183.83.183.187
[ec2-user@ip-10-0-0-157 ~]$ uptime
16:27:58 up 55 min, 1 user, load average: 0.05, 0.18, 0.69
[ec2-user@ip-10-0-0-157 ~]$ ls -l
total 0
[ec2-user@ip-10-0-0-157 ~]$ sudo -i
[root@ip-10-0-0-157 ~]# ls -l
total 536
-rw-------. 1 root root 7752 Mar 23 2018 anaconda-ks.cfg
-rwxr-xr-x. 1 root root 521190 Apr 9 2018 cloudera-manager-installer.bin
-rw-r--r-- 1 root root 6140 Nov 12 2015 mysql-community-release-el7-5.noarch.rpm
-rw-------. 1 root root 7080 Mar 23 2018 original-ks.cfg
[root@ip-10-0-0-157 ~]# ls -l
total 536
-rw-------. 1 root root 7752 Mar 23 2018 anaconda-ks.cfg
-rwxr-xr-x. 1 root root 521190 Apr 9 2018 cloudera-manager-installer.bin
-rw-r--r-- 1 root root 6140 Nov 12 2015 mysql-community-release-el7-5.noarch.rpm
-rw-------. 1 root root 7080 Mar 23 2018 original-ks.cfg
[root@ip-10-0-0-157 ~]# hadoop fs -ls /user
Found 4 items
drwxrwxrwx - mapred hadoop 0 2019-02-16 14:34 /user/history
drwxrwxr-t - hive hive 0 2019-02-16 14:35 /user/hive
drwxrwxr-x - hue hue 0 2019-02-16 14:36 /user/hue
drwxrwxr-x - oozie oozie 0 2019-02-16 14:36 /user/oozie
You have new mail in /var/spool/mail/root
[root@ip-10-0-0-157 ~]# sudo su hdfs
[hdfs@ip-10-0-0-157 root]$ hadoop fs -mkdir -p /user/ramg
[hdfs@ip-10-0-0-157 root]$ hadoop fs -ls /user
Found 5 items
drwxrwxrwx - mapred hadoop 0 2019-02-16 14:34 /user/history
drwxrwxr-t - hive hive 0 2019-02-16 14:35 /user/hive
drwxrwxr-x - hue hue 0 2019-02-16 14:36 /user/hue
drwxrwxr-x - oozie oozie 0 2019-02-16 14:36 /user/oozie
drwxr-xr-x - hdfs supergroup 0 2019-02-16 16:44 /user/ramg
[hdfs@ip-10-0-0-157 root]$ hadoop fs -chown ramg /user/ramg
[hdfs@ip-10-0-0-157 root]$ hadoop fs -ls /user
Found 5 items
drwxrwxrwx - mapred hadoop 0 2019-02-16 14:34 /user/history
drwxrwxr-t - hive hive 0 2019-02-16 14:35 /user/hive
drwxrwxr-x - hue hue 0 2019-02-16 14:36 /user/hue
drwxrwxr-x - oozie oozie 0 2019-02-16 14:36 /user/oozie
drwxr-xr-x - ramg supergroup 0 2019-02-16 16:44 /user/ramg
[hdfs@ip-10-0-0-157 root]$ hadoop fs -chmod 777 /user/ramg
[hdfs@ip-10-0-0-157 root]$ ls -l
ls: cannot open directory .: Permission denied
[hdfs@ip-10-0-0-157 root]$ ls -l
ls: cannot open directory .: Permission denied
[hdfs@ip-10-0-0-157 root]$ exit
exit
[root@ip-10-0-0-157 ~]# exit
logout
[ec2-user@ip-10-0-0-157 ~]$ hadoop fs -mkdir -p /user/ramg/wc
[ec2-user@ip-10-0-0-157 ~]$ ls -l
total 0
[ec2-user@ip-10-0-0-157 ~]$ hadoop fs -mkdir -p /user/ramg/wc/input
[ec2-user@ip-10-0-0-157 ~]$ hadoop fs -ls /user/ramg
Found 1 items
drwxr-xr-x - ec2-user supergroup 0 2019-02-16 16:52 /user/ramg/wc
[ec2-user@ip-10-0-0-157 ~]$ hadoop fs -ls /user/ramg/wc
Found 1 items
drwxr-xr-x - ec2-user supergroup 0 2019-02-16 16:52 /user/ramg/wc/input
[ec2-user@ip-10-0-0-157 ~]$ hadoop fs -mkdir /user/ec2-user
mkdir: Permission denied: user=ec2-user, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
[ec2-user@ip-10-0-0-157 ~]$ sudo -i hadoop fs -mkdir /user/ec2-user
mkdir: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
[ec2-user@ip-10-0-0-157 ~]$ ls -l
total 0
[ec2-user@ip-10-0-0-157 ~]$ sudo -i
[root@ip-10-0-0-157 ~]# hadoop fs -mkdir /user/ec2-user
mkdir: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
[root@ip-10-0-0-157 ~]# exit
logout
[ec2-user@ip-10-0-0-157 ~]$ id
uid=1000(ec2-user) gid=1000(ec2-user) groups=1000(ec2-user),4(adm),10(wheel),190(systemd-journal)
[ec2-user@ip-10-0-0-157 ~]$ pwd
/home/ec2-user
[ec2-user@ip-10-0-0-157 ~]$ hadoop fs -chmod 770 /home/ec2-user
chmod: `/home/ec2-user': No such file or directory
[ec2-user@ip-10-0-0-157 ~]$ ls -l
total 0
[ec2-user@ip-10-0-0-157 ~]$ cd ..
[ec2-user@ip-10-0-0-157 home]$ ls -l
total 0
drwx------. 3 ec2-user ec2-user 95 Jun 14 2018 ec2-user
[ec2-user@ip-10-0-0-157 home]$ hadoop fs -chmod 770 /home/ec2-user
chmod: `/home/ec2-user': No such file or directory
[ec2-user@ip-10-0-0-157 home]$ hadoop fs -chmod 770 ec2-user
chmod: `ec2-user': No such file or directory
[ec2-user@ip-10-0-0-157 home]$ sudo su hdfs
[hdfs@ip-10-0-0-157 home]$ cd
[hdfs@ip-10-0-0-157 ~]$ hadoop fs -mkdir /user/ec2-user
[hdfs@ip-10-0-0-157 ~]$ hadoop fs -chown ec2-user /user/ec2-user
[hdfs@ip-10-0-0-157 ~]$ ls -l
total 0
[hdfs@ip-10-0-0-157 ~]$ cd ..
[hdfs@ip-10-0-0-157 lib]$ exit
exit
[ec2-user@ip-10-0-0-157 home]$ cd
[ec2-user@ip-10-0-0-157 ~]$ ls -l
total 16
-rw-rw-r-- 1 ec2-user ec2-user 21 Feb 16 17:09 file0.txt.txt
-rw-rw-r-- 1 ec2-user ec2-user 29 Feb 16 17:10 file1.txt.txt
-rw-rw-r-- 1 ec2-user ec2-user 33 Feb 16 17:11 file3.txt.txt
-rw-rw-r-- 1 ec2-user ec2-user 3597 Feb 16 17:04 WordCount.java
[ec2-user@ip-10-0-0-157 ~]$ mkdir -p build
[ec2-user@ip-10-0-0-157 ~]$ ls -l
total 16
drwxrwxr-x 2 ec2-user ec2-user 6 Feb 16 17:21 build
-rw-rw-r-- 1 ec2-user ec2-user 21 Feb 16 17:09 file0.txt.txt
-rw-rw-r-- 1 ec2-user ec2-user 29 Feb 16 17:10 file1.txt.txt
-rw-rw-r-- 1 ec2-user ec2-user 33 Feb 16 17:11 file3.txt.txt
-rw-rw-r-- 1 ec2-user ec2-user 3597 Feb 16 17:04 WordCount.java
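
WordCount.java itself (3597 bytes) is not reproduced in this log. Judging from the imports, class names (TokenizerMapper, IntSumReducer) and statements quoted in the compiler errors below, it follows the standard Apache Hadoop WordCount example; a minimal sketch along those lines (line numbers will not match the original file exactly) is:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: tokenize each input line and emit (word, 1) pairs.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sum the counts for each word.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // Driver: args[0] is the HDFS input directory, args[1] the output directory.
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
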
[ec2-user@ip-10-0-0-157 ~]$ javac -cp /usr/lib/hadoop/*:/usr/lib/hadoop-mapreduce/* WordCount.java -d build -Xlint
warning: [path] bad path element "/usr/lib/hadoop/*": no such file or directory
warning: [path] bad path element "/usr/lib/hadoop-mapreduce/*": no such file or directory
WordCount.java:4: error: package org.apache.hadoop.conf does not exist
import org.apache.hadoop.conf.Configuration;
^
WordCount.java:5: error: package org.apache.hadoop.fs does not exist
import org.apache.hadoop.fs.Path;
^
WordCount.java:6: error: package org.apache.hadoop.io does not exist
import org.apache.hadoop.io.IntWritable;
^
WordCount.java:7: error: package org.apache.hadoop.io does not exist
import org.apache.hadoop.io.Text;
^
WordCount.java:8: error: package org.apache.hadoop.mapreduce does not exist
import org.apache.hadoop.mapreduce.Job;
^
WordCount.java:9: error: package org.apache.hadoop.mapreduce does not exist
import org.apache.hadoop.mapreduce.Mapper;
^
WordCount.java:10: error: package org.apache.hadoop.mapreduce does not exist
import org.apache.hadoop.mapreduce.Reducer;
^
WordCount.java:11: error: package org.apache.hadoop.mapreduce.lib.input does not exist
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
^
WordCount.java:12: error: package org.apache.hadoop.mapreduce.lib.output does not exist
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
^
WordCount.java:19: error: cannot find symbol
extends Mapper<Object, Text, Text, IntWritable>{
^
symbol: class Mapper
location: class WordCount
WordCount.java:19: error: cannot find symbol
extends Mapper<Object, Text, Text, IntWritable>{
^
symbol: class Text
location: class WordCount
WordCount.java:19: error: cannot find symbol
extends Mapper<Object, Text, Text, IntWritable>{
^
symbol: class Text
location: class WordCount
WordCount.java:19: error: cannot find symbol
extends Mapper<Object, Text, Text, IntWritable>{
^
symbol: class IntWritable
location: class WordCount
WordCount.java:26: error: cannot find symbol
private final static IntWritable one = new IntWritable(1);
^
symbol: class IntWritable
location: class TokenizerMapper
WordCount.java:28: error: cannot find symbol
private Text word = new Text();
^
symbol: class Text
location: class TokenizerMapper
WordCount.java:31: error: cannot find symbol
public void map(Object key, Text value, Context context
^
symbol: class Text
location: class TokenizerMapper
WordCount.java:31: error: cannot find symbol
public void map(Object key, Text value, Context context
^
symbol: class Context
location: class TokenizerMapper
WordCount.java:60: error: cannot find symbol
extends Reducer<Text,IntWritable,Text,IntWritable> {
^
symbol: class Reducer
location: class WordCount
WordCount.java:60: error: cannot find symbol
extends Reducer<Text,IntWritable,Text,IntWritable> {
^
symbol: class Text
location: class WordCount
WordCount.java:60: error: cannot find symbol
extends Reducer<Text,IntWritable,Text,IntWritable> {
^
symbol: class IntWritable
location: class WordCount
WordCount.java:60: error: cannot find symbol
extends Reducer<Text,IntWritable,Text,IntWritable> {
^
symbol: class Text
location: class WordCount
WordCount.java:60: error: cannot find symbol
extends Reducer<Text,IntWritable,Text,IntWritable> {
^
symbol: class IntWritable
location: class WordCount
WordCount.java:67: error: cannot find symbol
private IntWritable result = new IntWritable();
^
symbol: class IntWritable
location: class IntSumReducer
WordCount.java:71: error: cannot find symbol
public void reduce(Text key, Iterable<IntWritable> values,
^
symbol: class Text
location: class IntSumReducer
WordCount.java:71: error: cannot find symbol
public void reduce(Text key, Iterable<IntWritable> values,
^
symbol: class IntWritable
location: class IntSumReducer
WordCount.java:72: error: cannot find symbol
Context context
^
symbol: class Context
location: class IntSumReducer
WordCount.java:26: error: cannot find symbol
private final static IntWritable one = new IntWritable(1);
^
symbol: class IntWritable
location: class TokenizerMapper
WordCount.java:28: error: cannot find symbol
private Text word = new Text();
^
symbol: class Text
location: class TokenizerMapper
WordCount.java:67: error: cannot find symbol
private IntWritable result = new IntWritable();
^
symbol: class IntWritable
location: class IntSumReducer
WordCount.java:79: error: cannot find symbol
for (IntWritable val : values) {
^
symbol: class IntWritable
location: class IntSumReducer
WordCount.java:98: error: cannot find symbol
Configuration conf = new Configuration();
^
symbol: class Configuration
location: class WordCount
WordCount.java:98: error: cannot find symbol
Configuration conf = new Configuration();
^
symbol: class Configuration
location: class WordCount
WordCount.java:99: error: cannot find symbol
Job job = Job.getInstance(conf, "word count");
^
symbol: class Job
location: class WordCount
WordCount.java:99: error: cannot find symbol
Job job = Job.getInstance(conf, "word count");
^
symbol: variable Job
location: class WordCount
WordCount.java:104: error: cannot find symbol
job.setOutputKeyClass(Text.class);
^
symbol: class Text
location: class WordCount
WordCount.java:105: error: cannot find symbol
job.setOutputValueClass(IntWritable.class);
^
symbol: class IntWritable
location: class WordCount
WordCount.java:106: error: cannot find symbol
FileInputFormat.addInputPath(job, new Path(args[0]));
^
symbol: class Path
location: class WordCount
WordCount.java:106: error: cannot find symbol
FileInputFormat.addInputPath(job, new Path(args[0]));
^
symbol: variable FileInputFormat
location: class WordCount
WordCount.java:107: error: cannot find symbol
FileOutputFormat.setOutputPath(job, new Path(args[1]));
^
symbol: class Path
location: class WordCount
WordCount.java:107: error: cannot find symbol
FileOutputFormat.setOutputPath(job, new Path(args[1]));
^
symbol: variable FileOutputFormat
location: class WordCount
40 errors
2 warnings
[ec2-user@ip-10-0-0-157 ~]$ javac -cp /opt/cloudera/parcels/CDH/lib/hadoop/*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/* \
> ^C WordCount.java -d build -Xlint
[ec2-user@ip-10-0-0-157 ~]$ javac -cp /opt/cloudera/parcels/CDH/lib/hadoop/*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/WordCount.java -d build -Xlint
javac: no source files
Usage: javac <options> <source files>
use -help for a list of possible options
[ec2-user@ip-10-0-0-157 ~]$ javac -cp /opt/cloudera/parcels/CDH/lib/hadoop/*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/* \
> WordCount.java -d build -Xlint
warning: [path] bad path element "/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/jaxb-api.jar": no such file or directory
warning: [path] bad path element "/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/activation.jar": no such file or directory
warning: [path] bad path element "/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/jsr173_1.0_api.jar": no such file or directory
warning: [path] bad path element "/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/jaxb1-impl.jar": no such file or directory
4 warnings
[ec2-user@ip-10-0-0-157 ~]$ jar -cvf wordcount.jar -C build/ .
added manifest
adding: WordCount$TokenizerMapper.class(in = 1736) (out= 755)(deflated 56%)
adding: WordCount$IntSumReducer.class(in = 1739) (out= 741)(deflated 57%)
adding: WordCount.class(in = 1501) (out= 809)(deflated 46%)
[ec2-user@ip-10-0-0-157 ~]$ ls
build file0.txt.txt file1.txt.txt file3.txt.txt wordcount.jar WordCount.java
[ec2-user@ip-10-0-0-157 ~]$ hadoop fs -put file* /user/ramg/wc/input
[ec2-user@ip-10-0-0-157 ~]$ hadoop jar wordcount.jar WordCount /user/ramg/wc/input /user/ramg/wc/output
19/02/16 17:34:35 INFO client.RMProxy: Connecting to ResourceManager at ip-10-0-0-157.ec2.internal/10.0.0.157:8032
19/02/16 17:34:35 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
19/02/16 17:34:36 INFO input.FileInputFormat: Total input paths to process : 3
19/02/16 17:34:36 INFO mapreduce.JobSubmitter: number of splits:3
19/02/16 17:34:37 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1550335222361_0001
19/02/16 17:34:37 INFO impl.YarnClientImpl: Submitted application application_1550335222361_0001
19/02/16 17:34:37 INFO mapreduce.Job: The url to track the job: http://ip-10-0-0-157.ec2.internal:8088/proxy/application_1550335222361_0001/
19/02/16 17:34:37 INFO mapreduce.Job: Running job: job_1550335222361_0001
19/02/16 17:34:45 INFO mapreduce.Job: Job job_1550335222361_0001 running in uber mode : false
19/02/16 17:34:45 INFO mapreduce.Job: map 0% reduce 0%
19/02/16 17:34:54 INFO mapreduce.Job: map 67% reduce 0%
19/02/16 17:34:55 INFO mapreduce.Job: map 100% reduce 0%
19/02/16 17:35:03 INFO mapreduce.Job: map 100% reduce 50%
19/02/16 17:35:04 INFO mapreduce.Job: map 100% reduce 100%
19/02/16 17:35:07 INFO mapreduce.Job: Job job_1550335222361_0001 completed successfully
19/02/16 17:35:07 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=159
        FILE: Number of bytes written=747241
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=491
        HDFS: Number of bytes written=80
        HDFS: Number of read operations=15
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=4
    Job Counters
        Launched map tasks=3
        Launched reduce tasks=2
        Data-local map tasks=3
        Total time spent by all maps in occupied slots (ms)=18711
        Total time spent by all reduces in occupied slots (ms)=12452
        Total time spent by all map tasks (ms)=18711
        Total time spent by all reduce tasks (ms)=12452
        Total vcore-milliseconds taken by all map tasks=18711
        Total vcore-milliseconds taken by all reduce tasks=12452
        Total megabyte-milliseconds taken by all map tasks=19160064
        Total megabyte-milliseconds taken by all reduce tasks=12750848
    Map-Reduce Framework
        Map input records=3
        Map output records=18
        Map output bytes=158
        Map output materialized bytes=275
        Input split bytes=408
        Combine input records=18
        Combine output records=17
        Reduce input groups=12
        Reduce shuffle bytes=275
        Reduce input records=17
        Reduce output records=12
        Spilled Records=34
        Shuffled Maps =6
        Failed Shuffles=0
        Merged Map outputs=6
        GC time elapsed (ms)=268
        CPU time spent (ms)=4820
        Physical memory (bytes) snapshot=1834561536
        Virtual memory (bytes) snapshot=7990865920
        Total committed heap usage (bytes)=2321547264
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=83
    File Output Format Counters
        Bytes Written=80
[ec2-user@ip-10-0-0-157 ~]$ hadoop fs -cat /user/ramg/wc/output
cat: `/user/ramg/wc/output': Is a directory
[ec2-user@ip-10-0-0-157 ~]$ hadoop fs -cat /user/ramg/wc/output/*
a 1
an 1
be 1
elephant 1
fellow 1
hadoop 3
oh 1
as 2
can 1
is 3
what 1
yellow 2
[ec2-user@ip-10-0-0-157 ~]$
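
A final note: the job log above warns "Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this." The session never addresses that warning; the usual fix (a sketch of the standard Tool/ToolRunner pattern, not something run in this session) is to rewrite only the driver so that generic options such as -D and -libjars are parsed before the job is configured:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCount extends Configured implements Tool {

    // The TokenizerMapper and IntSumReducer inner classes stay exactly as in the
    // original WordCount.java and are omitted here for brevity.

    @Override
    public int run(String[] args) throws Exception {
        // getConf() returns the Configuration already populated with any -D options.
        Job job = Job.getInstance(getConf(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options before handing the remaining args to run().
        System.exit(ToolRunner.run(new Configuration(), new WordCount(), args));
    }
}

Compiled and packaged the same way as above, it would be invoked exactly as before: hadoop jar wordcount.jar WordCount /user/ramg/wc/input /user/ramg/wc/output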
