
MOHAMED SATHAK A.J COLLEGE OF ENGINEERING

(Approved by AICTE, New Delhi and Affiliated to Anna University)


34, Old Mahabalipuram Road, Egattur, Chennai – 603103.

ISO 9001:2008 Certified Institution

Sponsored By: MOHAMED SATHAK TRUST, Chennai – 600034.

(ACADEMIC YEAR: 2023-2024)

CCS335 – CLOUD COMPUTING LABORATORY

Name :

Register Number :

Department : INFORMATION TECHNOLOGY

Year/Semester : III/VI

(ANNA UNIVERSITY: CHENNAI - 600025)
MOHAMED SATHAK A.J COLLEGE OF ENGINEERING
(Approved by AICTE, New Delhi and Affiliated to Anna University)
34, Old Mahabalipuram Road, Egattur, Chennai – 603103.

ISO 9001:2008 Certified Institution

Sponsored By: MOHAMED SATHAK TRUST, Chennai – 600034.

BONAFIDE CERTIFICATE

This is to certify that this is the Bonafide Record of work done by


Mr./Ms......................................................……………….

Register Number..................................................................................................... of

III YEAR / VI Semester – INFORMATION TECHNOLOGY


CCS335 – CLOUD COMPUTING LABORATORY during the academic year 2023-2024.

Staff In-Charge Head of the Department

Submitted for the Anna University B.E. Practical Examination held on .................................

Internal Examiner External Examiner


INDEX PAGE
LIST OF EXPERIMENTS

Ex.No Date Name of the Experiment Page No Signature

EX.NO: 1 VIRTUALBOX / VMWARE WORKSTATION INSTALLATION
Date :

AIM:
To install VirtualBox or VMware Workstation and to create and run virtual machines with different flavours of Linux or Windows OS on top of Windows 7 or 8.

PROCEDURE:

STEP 1 : INSTALLATION OF VMWARE:


 Download the latest version of VMware from the official website of VMware.
 Run the setup file
 Select the location of the VMware primary folder in your disk.
 Install the VMware.
 Open the VMware application.

STEP 2: DOWNLOAD THE ISO FILES


 Select the OS which you want to install as a virtual OS from the list of the
following:
a) Windows Vista
b) Windows XP Professional
c) Windows 7
d) Windows 8
e) Ubuntu 20.04 or other versions
f) Kali Linux
g) CentOS
 Download the ISO file of the OS which you have selected.
 Create a new folder inside the VMware directory so that the ISO can be located from the VMware application itself.
 Save the downloaded ISO file in that folder.

STEP 3: CREATE A NEW VIRTUAL MACHINE

Click the Create a New Virtual Machine icon displayed on the center of the screen
or go to File ---> New Virtual Machine or press Ctrl + N to create a new virtual
machine.

STEP 4: TYPE OF CONFIGURATION

Select the type of configuration you want (Typical is preferred in most cases) and click Next.

STEP 5: GUEST OPERATING SYSTEM INFORMATION

Choose the Installer disc image file (ISO) option, browse to the location of the ISO file of the desired OS, and click Next.

SELECTED OS: Ubuntu

STEP 6: EASY INSTALL INFORMATION


If you are installing Linux, give the Full name, Username and Password, and confirm the password.

STEP 7: NAME THE VIRTUAL MACHINE

Choose the name for the virtual machine and its location.

STEP 8 : SPECIFY DISK CAPACITY

Give the size of your virtual disk and choose either Store virtual disk as a single file or Split virtual disk into multiple files. The recommended size for Ubuntu 64-bit is 20 GB.

STEP 9 : READY TO CREATE VIRTUAL MACHINE

Verify the details of the virtual machine and click Finish to create it; Ubuntu 64-bit and then VMware Tools will be installed.

OUTPUT:

RESULT:

EX.NO: 2 INSTALL C COMPILER IN VIRTUAL MACHINE AND EXECUTE PROGRAMS
Date :

PROGRAM:

#include <stdio.h>

int check_anagram(char [], char []);

int main()
{
    char a[1000], b[1000];

    printf("Enter two strings\n");
    scanf("%s", a);
    scanf("%s", b);

    if (check_anagram(a, b))
        printf("The strings are anagrams.\n");
    else
        printf("The strings aren't anagrams.\n");

    return 0;
}

int check_anagram(char a[], char b[])
{
    int first[26] = {0}, second[26] = {0}, c = 0;

    /* count letter frequencies of the first string */
    while (a[c] != '\0') {
        first[a[c] - 'a']++;
        c++;
    }

    c = 0;
    /* count letter frequencies of the second string */
    while (b[c] != '\0') {
        second[b[c] - 'a']++;
        c++;
    }

    /* the strings are anagrams only if every letter count matches */
    for (c = 0; c < 26; c++)
        if (first[c] != second[c])
            return 0;

    return 1;
}

PROCEDURE:

STEP 1: OPEN VIRTUAL MACHINE

Open the Ubuntu virtual machine.

STEP 2 : INSTALL C COMPILER

 Right-click on the desktop and select Open Terminal. The terminal window opens.
 Type the following command to install the C Compiler: sudo apt-get install gcc
 Type the password to allow the installation.
STEP 3: EXECUTE PROGRAM
 Open the terminal and open the gedit editor using the command : gedit filename.c

 Type the C program and save the file.

 Compile the C program using the following command : gcc filename.c -o filename

 Run the C program using the command : ./filename


OUTPUT:

RESULT:


EX.NO: 3.b) DEVELOP CELSIUS TO FAHRENHEIT WEB APPLICATION USING PYTHON / JAVA
Date:
test1.py:
import webapp2

def convert_temp(cel_temp):
    '''Convert a Celsius temperature to a Fahrenheit temperature.'''
    if cel_temp == "":
        return ""
    try:
        far_temp = float(cel_temp) * 9 / 5 + 32  # e.g. 100 degrees C -> 212.0 degrees F
        far_temp = round(far_temp, 3)  # round to three decimal places
        return str(far_temp)
    except ValueError:  # user entered a non-numeric temperature
        return "invalid input"

class MainPage(webapp2.RequestHandler):
    def get(self):
        cel_temp = self.request.get("cel_temp")
        far_temp = convert_temp(cel_temp)
        self.response.headers["Content-Type"] = "text/html"
        self.response.write("""
<html><head><title>Temperature Converter</title></head>
<body>
<form action="/" method="get">
Celsius temperature: <input type="text" name="cel_temp" value={}>
<input type="submit" value="Convert"><br>
Fahrenheit temperature: {}
</form>
</body>
</html>""".format(cel_temp, far_temp))

routes = [('/', MainPage)]
my_app = webapp2.WSGIApplication(routes, debug=True)
14. Create a text document, type the YAML configuration below, and save it as app.yaml.
app.yaml:
runtime: python27
api_version: 1
threadsafe: true
handlers:
- url: /
  script: test1.my_app
15. After creating both the .yaml and .py files, run the application with the dev_appserver.py script in the SDK's bin folder, using the command: google-cloud-sdk\bin\dev_appserver.py C:\Users\DELL\Desktop\test
16. Select 'Y' if prompted to install the components. After the components are installed, the Python file is executed and hosted automatically.
RESULT:

EX.NO: 4 LAUNCH WEB APPS USING GOOGLE APP ENGINE LAUNCHER
Date:

AIM :
To use GAE Launcher to launch the web applications.

PROCEDURE :

Step 1: Download the original App Engine SDK for Python 2.

Step 2: Install the App Engine SDK. Make sure that Python 2.7 is installed. Click Next.

 Accept the terms and conditions for the installation. Click Next.

 Choose the destination folder to install. Click Next.

 Click Install to complete the installation process.


Step 3:
 Open the Google App Engine Launcher. Click Create New Application. In the pop-up window, enter the Application name and choose the Parent directory. Click Create.

Step 4:
Click the application name and click the Run icon. After the application runs, the symbol on the left of the application name turns green. Note the port number and type localhost:portnumber in the browser.

PROGRAM:
PYTHON PROGRAM(GCD):

#!/usr/bin/env python

import webapp2

class MainHandler(webapp2.RequestHandler):
def get(self):
self.html = ''' <html><body><h2> GCD</h2><form action = "/submit" method = "get">
Enter Number 1 <input type = "text" name = "A" id = "numA"> <br><br>
Enter Number 2 <input type = "text" name = "B" id = "numB"> <br> <br>
<input type = "submit" value = "Submit"> <br> </form> </body> </html> '''

self.response.write(self.html)
return self.response;
class SubmitHandler(webapp2.RequestHandler):
def get(self):

self.a = int(self.request.get('A'))
self.b = int(self.request.get('B'))
self.res = self.gcd(self.a,self.b)
self.response.write("The GCD of "+ str(self.a) + " and " + str(self.b) +" is \n")
self.response.write(self.res)
return self.response;

def gcd(self,a,b):
if (self.a > self.b):
self.small = b
else:
self.small = a

for self.i in range(1,self.small+1):


if (self.a % self.i == 0 and self.b % self.i == 0):
self.gcd = self.i

return self.gcd

Route configuration (added at the end of the Python file):
app = webapp2.WSGIApplication([
('/', MainHandler),
('/submit',SubmitHandler),
], debug=True)
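The app.yaml for this exercise is not reproduced above. A minimal sketch, mirroring the app.yaml used in the previous exercise, is given below; the handler file name main.py (and therefore the script reference main.app) is only an assumption, so adjust it to the name under which the program above is actually saved.

app.yaml (sketch):
runtime: python27
api_version: 1
threadsafe: true
handlers:
- url: /.*
  script: main.app   # assumed: program saved as main.py, WSGI application variable named app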

OUTPUT

RESULT:

EX.NO: 5 TO SIMULATE A CLOUD SCENARIO USING CLOUDSIM AND RUN A SCHEDULING ALGORITHM
Date:

PROGRAM:

package cloudsimexample5;
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.LinkedList;
import java.util.List;

import org.cloudbus.cloudsim.Cloudlet;
import org.cloudbus.cloudsim.CloudletSchedulerTimeShared;
import org.cloudbus.cloudsim.Datacenter;
import org.cloudbus.cloudsim.DatacenterBroker;
import org.cloudbus.cloudsim.DatacenterCharacteristics;
import org.cloudbus.cloudsim.Host;
import org.cloudbus.cloudsim.Log;
import org.cloudbus.cloudsim.Pe;
import org.cloudbus.cloudsim.Storage;
import org.cloudbus.cloudsim.UtilizationModel;
import org.cloudbus.cloudsim.UtilizationModelFull;
import org.cloudbus.cloudsim.Vm;
import org.cloudbus.cloudsim.VmAllocationPolicySimple;
import org.cloudbus.cloudsim.VmSchedulerSpaceShared;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.provisioners.BwProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.PeProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.RamProvisionerSimple;

public class CloudSimExample5 {


private static List<Cloudlet> cloudletList1;
private static List<Cloudlet> cloudletList2;
private static List<Vm> vmlist1;
private static List<Vm> vmlist2;
public static void main(String[] args) {
// TODO code application logic here
System.out.println("Starting CloudSimExample5...");
try {
int num_user = 2;
Calendar calendar = Calendar.getInstance();
boolean trace_flag = false;
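// Initialise the CloudSim engine before creating any simulation entities (datacenters, brokers, VMs, cloudlets)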
CloudSim.init(num_user, calendar, trace_flag);
@SuppressWarnings("unused")
Datacenter datacenter0 = createDatacenter("Datacenter_0");
@SuppressWarnings("unused")
Datacenter datacenter1 = createDatacenter("Datacenter_1");
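// One broker per user; each broker acquires its own VM and submits its own cloudlet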
DatacenterBroker broker1 = createBroker(1);
int brokerId1 = broker1.getId();
DatacenterBroker broker2 = createBroker(2);
int brokerId2 = broker2.getId();
vmlist1 = new ArrayList<Vm>();
vmlist2 = new ArrayList<Vm>();
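// Shared VM description: both users request an identical single-PE VM with a time-shared cloudlet scheduler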
int vmid = 0;
int mips = 250;
long size = 10000;
int ram = 512;
long bw = 1000;
int pesNumber = 1;
String vmm = "Xen";
Vm vm1 = new Vm(vmid, brokerId1, mips, pesNumber, ram, bw, size, vmm, new
CloudletSchedulerTimeShared());
Vm vm2 = new Vm(vmid, brokerId2, mips, pesNumber, ram, bw, size, vmm, new
CloudletSchedulerTimeShared());
vmlist1.add(vm1);
vmlist2.add(vm2);
broker1.submitVmList(vmlist1);
broker2.submitVmList(vmlist2);
cloudletList1 = new ArrayList<Cloudlet>();
cloudletList2 = new ArrayList<Cloudlet>();
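// Cloudlet (task) description: 40000 MI of work, 300-byte input and output files, full utilization model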
int id = 0;
long length = 40000;
long fileSize = 300;
long outputSize = 300;
UtilizationModel utilizationModel = new UtilizationModelFull();

Cloudlet cloudlet1 = new Cloudlet(id, length, pesNumber, fileSize, outputSize,


utilizationModel, utilizationModel, utilizationModel);
cloudlet1.setUserId(brokerId1);
id=1;
Cloudlet cloudlet2 = new Cloudlet(id, length, pesNumber, fileSize, outputSize,
utilizationModel, utilizationModel, utilizationModel);
cloudlet2.setUserId(brokerId2);

cloudletList1.add(cloudlet1);
cloudletList2.add(cloudlet2);

broker1.submitCloudletList(cloudletList1);
broker2.submitCloudletList(cloudletList2);

CloudSim.startSimulation();

List<Cloudlet> newList1 = broker1.getCloudletReceivedList();


List<Cloudlet> newList2 = broker2.getCloudletReceivedList();

CloudSim.stopSimulation();
System.out.println("=============> User "+brokerId1+" ");
printCloudletList(newList1);
System.out.println("=============> User "+brokerId2+" ");
printCloudletList(newList2);
System.out.println("CloudSimExample5 finished!");
}
catch (Exception e) {
e.printStackTrace();
System.out.println("The simulation has been terminated due to an unexpected
error");
}
}
private static Datacenter createDatacenter(String name){
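// Builds a datacenter with one host: a single 1000-MIPS PE, 2 GB RAM, and a space-shared VM scheduler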

List<Host> hostList = new ArrayList<Host>();


List<Pe> peList = new ArrayList<Pe>();
int mips=1000;
peList.add(new Pe(0, new PeProvisionerSimple(mips)));
int hostId=0;
int ram = 2048;
long storage = 1000000;
int bw = 10000;

hostList.add(
new Host(
hostId,
new RamProvisionerSimple(ram),
new BwProvisionerSimple(bw),
storage,
peList,
new VmSchedulerSpaceShared(peList)
)
);

String arch = "x86";


String os = "Linux";
String vmm = "Xen";
double time_zone = 10.0;
double cost = 3.0;
double costPerMem = 0.05;
double costPerStorage = 0.001;
double costPerBw = 0.0;
LinkedList<Storage> storageList = new LinkedList<Storage>();

DatacenterCharacteristics characteristics = new DatacenterCharacteristics(


arch, os, vmm, hostList, time_zone, cost, costPerMem, costPerStorage,
costPerBw);

Datacenter datacenter = null;
try {
datacenter = new Datacenter(name, characteristics, new
VmAllocationPolicySimple(hostList), storageList, 0);
} catch (Exception e) {
e.printStackTrace();
}

return datacenter;
}
private static DatacenterBroker createBroker(int id){

DatacenterBroker broker = null;


try {
broker = new DatacenterBroker("Broker"+id);
} catch (Exception e) {
e.printStackTrace();
return null;
}
return broker;
}

private static void printCloudletList(List<Cloudlet> list) {


int size = list.size();
Cloudlet cloudlet;
String indent = " ";
System.out.println();
System.out.println("========== OUTPUT ==========");
System.out.println("Cloudlet ID" + indent + "STATUS" + indent +
"Data center ID" + indent + "VM ID" + indent + "Time" + indent +
"Start Time" + indent + "Finish Time");
DecimalFormat dft = new DecimalFormat("###.##");
for (int i = 0; i < size; i++) {
cloudlet = list.get(i);
if (cloudlet.getCloudletStatus() == Cloudlet.SUCCESS){
System.out.println(indent + cloudlet.getCloudletId() + indent + indent +
"SUCCESS"+indent + indent +cloudlet.getResourceId() + indent + indent + indent+
indent + cloudlet.getVmId() +
indent + indent + dft.format(cloudlet.getActualCPUTime()) +
indent + indent + dft.format(cloudlet.getExecStartTime())+
indent + indent + dft.format(cloudlet.getFinishTime()));
}
}

}
}
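The figures reported in the simulation output can be sanity-checked by hand: each cloudlet is 40000 MI long and each VM runs at 250 MIPS, so one cloudlet needs roughly 40000 / 250 = 160 simulated seconds. The short standalone Java sketch below (independent of CloudSim, assuming a simple first-come-first-served queue on a single 250-MIPS VM) reproduces that arithmetic; the exact CloudSim results still depend on the chosen cloudlet scheduler and VM allocation policy.

public class FcfsEstimate {
    public static void main(String[] args) {
        long[] lengthsMI = {40000, 40000}; // cloudlet lengths in million instructions, as above
        int mips = 250;                    // VM speed, as configured above
        double clock = 0.0;
        for (int i = 0; i < lengthsMI.length; i++) {
            clock += (double) lengthsMI[i] / mips; // time to execute this cloudlet
            System.out.printf("Cloudlet %d is expected to finish at about %.2f s%n", i, clock);
        }
    }
}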

OUTPUT :

RESULT:

EX.NO: 6 TRANSFER A FILE BETWEEN TWO VIRTUAL MACHINES
Date:

PROCEDURE:

● Open the VMware workstation.

● Before turning on the virtual machines, create a shared folder in the host OS

● Choose the people you want to share the folder with. Select Everyone.

● Configure the Shared folder settings.

 Click the first virtual machine and go to VM -> Settings. Go to Options, choose Shared Folders, click Always Enabled, and add the folder to be shared.
● Click Files and then go to Other Locations. Choose Computer and go to mnt -> hgfs, so that the shared folder is visible.

● Now open the file file.txt and edit its contents.


● Save the file and turn off the first virtual machine.

● Turn on the second virtual machine, click Files and then Other Locations. Choose Computer and go to mnt -> hgfs.
● Open the shared folder seen in the previous virtual machine and open the text file.

● Finally, you can see that the file has been transferred from one virtual machine to another by using the shared folder method.

OUTPUT:

RESULT:

Ex.No: 7 HADOOP SINGLE NODE CLUSTER
Date:

Aim:
To find procedure to set up the one node Hadoop cluster.

Procedure
1. Open VM – new -> full name : centos7
username : admin
password : admin
confirm password :admin.
2. Create a folder in the D drive (D:\centos7 Hadoop). After that, go to the hardware customization settings and change the memory to 4 GB.
3. During the installation, create a user (user creation option at the right corner):
Full name : hadoop
User : hadoop
Password : hadoop
4. Copy Hadoop 3.0.3 and JDK 1.8.0 from Windows to the hadoop (or root) user's desktop and extract the archives on the desktop itself.
5. Log in as the root user and run:
i) $ groupadd cluster
ii) $ usermod -aG cluster hadoop
6. Open the .bashrc file, present in the hadoop user's home directory, and add the following lines at the end of the file:


export JAVA_HOME=jdk1.8.0_45
export HADOOP_HOME=hadoop-3.0.3
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export PATH

7. Go to the root user and execute the .bashrc file:
$ exec bash
$ source .bashrc
$ hadoop version
8. Open the Hadoop folder and edit the hadoop-env.sh file (right-click and open with the gedit text editor). Set JAVA_HOME to the Java home directory path (copy and paste the path of the JDK folder).

9. Copy and paste the following configuration into the files under hadoop folder --> etc --> hadoop.

Open vim core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Open vim mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.job.tracker</name>
<value>localhost:54311</value>
</property>
<property>
<name>mapreduce.tasktracker.map.tasks.maximum</name>
<value>4</value>
</property>
<property>
<name>mapreduce.map.tasks</name>
<value>4</value>
</property>
</configuration>
Open vim hdfs-site.xml // to edit the username in this file
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>

<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hduser/hdfs/namenode</value> <!-- change hduser to your Hadoop user name -->
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hduser/hdfs/datanode</value> <!-- change hduser to your Hadoop user name -->
</property>
</configuration>

10. Generate the SSH key (three commands):

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost (answer yes when it asks whether to continue connecting)
Enter password : hadoop
11. Format the Hadoop NameNode. Go to the hadoop user login and run:
$ hdfs namenode -format
12. Start Hadoop:
$ start-all.sh (or) hadoop-3.0.3/sbin/start-all.sh
13. Check whether the nodes have been created:
$ jps
If the NameNode is created, go to localhost and open the NameNode and DataNode web pages by typing the following addresses in the browser:
localhost:50070
localhost:8088
If the NameNode is not created, execute the command below and repeat steps 11 and 12:
rm -r hdfs/

Output:
NameNode:

DataNode:

Result:

● Optionally, configure the time zone:


Replace TIME_ZONE with an appropriate time zone identifier.
● Add the following line to /etc/apache2/conf-available/openstack-dashboard.conf if it is not already included:
WSGIApplicationGroup %{GLOBAL}
● Reload the web server configuration by using the command systemctl reload
apache2.service

LAUNCH VIRTUAL MACHINE USING OPENSTACK DASHBOARD:

●Open the Openstack dashboard from the browser.


●Go to Project → Compute → Instances.

●Click "Launch Instance".

● Insert the name of the Instance (eg. "vm01") and click Next button

●Select Instance Boot Source (eg. "Image"), and choose desired image (eg. "Ubuntu
16.04 LTS") by clicking on arrow. Keep the setting "Create New Volume" feature to
"No" state.

●Choose Flavour (eg. eo1.xsmall).

●Click "Networks" and then choose desired networks.

●Open "Security Groups" After that, choose "allow_ping_ssh_rdp" and "default".

●Choose or generate SSH keypair for your VM. Next, launch your instance by
clicking on blue button.

●You will see "Instances" menu with your newly created VM. Open the drop-down
menu and choose "Console".

●Click on the black terminal area (to activate access to the console). Type: eoconsole
and hit Enter.

● Insert and retype new password.

●Now you can type commands. After you finish, type "exit". This will close the
session.

RESULT:

Ex.No: 7.b) USE THE HADOOP SINGLE NODE CLUSTER TO IMPLEMENT WORD COUNT PROGRAM
Date:

Program:

import java.io.IOException;
import java.util.*;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.*;
import org.apache.hadoop.mapreduce.lib.output.*;
import org.apache.hadoop.util.*;

public class WordCount extends Configured implements Tool {


public static void main(String args[]) throws Exception {
int res = ToolRunner.run(new WordCount(), args);
System.exit(res);
}
public int run(String[] args) throws Exception {
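// Configure the job: input/output paths and formats, key/value types, and the map/combine/reduce classes, then wait for completion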
Path inputPath = new Path(args[0]);
Path outputPath = new Path(args[1]);
Configuration conf = getConf();
Job job = new Job(conf, this.getClass().toString());

FileInputFormat.setInputPaths(job, inputPath);
FileOutputFormat.setOutputPath(job, outputPath);
job.setJobName("WordCount");
job.setJarByClass(WordCount.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setMapperClass(Map.class);
job.setCombinerClass(Reduce.class);
job.setReducerClass(Reduce.class);
return job.waitForCompletion(true) ? 0 : 1; }
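// Mapper: emits (word, 1) for every whitespace-separated token of each input line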
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
private final static IntWritable one = new IntWritable(1);
private Text word = new Text();
public void map(LongWritable key, Text value,
Mapper.Context context) throws IOException, InterruptedException {
String line = value.toString();
StringTokenizer tokenizer = new StringTokenizer(line);
while (tokenizer.hasMoreTokens()) {
word.set(tokenizer.nextToken());
context.write(word, one);
} } }
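// Reducer (also used as the combiner): sums the counts emitted for each word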
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException,
InterruptedException {
int sum = 0;
for (IntWritable value : values) {
sum += value.get();
}
context.write(key, new IntWritable(sum));
} }}
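Before running the job on the cluster, the expected output can be checked with a small standalone sketch of the same counting logic (plain Java, no Hadoop; the sample line below assumes the input file contains three occurrences of each word, matching the output shown later):

import java.util.*;

public class LocalWordCount {
    public static void main(String[] args) {
        // Sample input line; assumed to mirror the contents of a.txt used below
        String text = "aaa aaa aaa bbb bbb bbb ccc ccc ccc";
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            // Plays the role of the mapper's (word, 1) emission plus the reducer's sum
            counts.merge(tokenizer.nextToken(), 1, Integer::sum);
        }
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}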

NameNode:

DataNode:

Input File (a.txt):
aaa aaa aaa
bbb bbb bbb
ccc ccc ccc

Output:
aaa 3
bbb 3
ccc 3

Result:
