
Cloud Computing

Hadoop Lab
Lab 1: HDFS operations
Part I: Basic HDFS shell operations (POSIX-like)
Lab Objective
This part familiarizes you with HDFS operations via its shell commands.
Lab Steps
Step 1:
Make sure that your Hadoop cluster has already been set up and that HDFS is running.
$ bin/start-dfs.sh
Step 2:
In the Hadoop home directory, enter the following command:
$ bin/hadoop fs
You will see a usage message listing the available HDFS shell commands.

These POSIX-like commands let you operate on HDFS much as you would on a local file system.
Step 3:
Upload the data to HDFS.
$ bin/hadoop fs -mkdir input
$ bin/hadoop fs -put conf/ input/
To verify that the data has been placed in the input directory:
$ bin/hadoop fs -ls input/

Run the MapReduce job:
$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount input output
To retrieve the result from HDFS, you have two choices:
(1) $ bin/hadoop fs -cat output/part-r-00000
(2) $ bin/hadoop fs -copyToLocal output/ output/
With choice (2), you will find the output directory under /opt/hadoop/
Step 4:
You can also delete files and directories with HDFS shell commands. For example,
to delete the file output/part-r-00000:
$ bin/hadoop fs -rm output/part-r-00000
$ bin/hadoop fs -ls output/
Then delete the output directory:
$ bin/hadoop fs -rmr output
$ bin/hadoop fs -ls
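
For reference, the same cleanup can be done programmatically with the Hadoop API introduced in Part II. The following is a minimal sketch (the class name ex00del is hypothetical and not part of this lab):

import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.*;

// Hypothetical helper that mirrors "hadoop fs -rmr <path>"
public class ex00del {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem hdfs = FileSystem.get(conf);
        // delete(path, true) removes a directory recursively, like -rmr
        hdfs.delete(new Path(args[0]), true);
    }
}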


Part II: Java programs (using the APIs)
Lab Objective
In this part, you will learn how to write simple programs that use the Hadoop APIs
to interact with HDFS.
Lab Steps
Exercise 1: Create a file on HDFS.
Step 1: Modify the .profile in the hadoop user's home directory.
In order to link against the Hadoop library, you need to add the Hadoop core jar to
the CLASSPATH used by the Java compiler.
$ vim ~/.profile
Add the following lines at the end of the file.

CLASSPATH=/opt/hadoop/hadoop-0.20.2-core.jar
export CLASSPATH

Then log in again:
$ exit
$ su - hadoop

Step 2: Create a Java file.
Create a file named ex01.java.
$ vim ex01.java
Paste the following program into it.

import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.*;

public class ex01 {
    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.print("Need two arguments\n");
            System.out.flush();
            return;
        }
        System.out.print("Create File: " + args[0] + "\n");
        System.out.print("Content: " + args[1] + "\n");
        System.out.flush();

        try {
            // Load the cluster configuration and get a handle to HDFS
            Configuration conf = new Configuration();
            FileSystem hdfs = FileSystem.get(conf);

            // Create the file on HDFS and write the content to it
            Path filepath = new Path(args[0]);
            FSDataOutputStream output = hdfs.create(filepath);

            byte[] buff = args[1].getBytes();
            output.write(buff, 0, buff.length);
            // Close the stream so the data is flushed to HDFS
            output.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Step 3: Compile ex01.java
$ javac ex01.java
Note: If the project is too large, you may need to pack your program into a jar file.
You can do this with the following commands.
$ mkdir ex01
$ mv ex01.class ex01/
$ jar cvf ex01.jar -C ex01/ .

Step 4: Run the program.
Run the program on Hadoop (quote the content so that the shell passes it as a single
argument; otherwise only "Hello" would be written):
$ bin/hadoop ex01 test "Hello World"
Then you can check the content of the file test on HDFS.
$ bin/hadoop fs -cat test
Note: If you packed your program into a jar file, you can run it with this command:
$ bin/hadoop jar ex01.jar ex01 test "Hello World"
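
To double-check the write programmatically, you can read the file back through the same API. The following is a minimal sketch (the class name ex01read is hypothetical and not part of this lab); compile and run it the same way as ex01:

import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.*;

public class ex01read {
    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            FileSystem hdfs = FileSystem.get(conf);

            // Open the HDFS file and read its content into a buffer
            FSDataInputStream input = hdfs.open(new Path(args[0]));
            byte[] buff = new byte[256];
            int n = input.read(buff, 0, buff.length);
            input.close();

            if (n > 0) {
                System.out.println(new String(buff, 0, n));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

$ javac ex01read.java
$ bin/hadoop ex01read test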

Exercise 2: Upload a local file/directory to HDFS
Step 1: Create a Java file.
Create a file named ex02.java.
$ vim ex02.java
Paste the following program into it.

import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.*;

public class ex02 {
    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.print("Need two arguments\n");
            System.out.flush();
            return;
        }
        System.out.print("Source File: " + args[0] + "\n");
        System.out.print("Destination File: " + args[1] + "\n");
        System.out.flush();

        try {
            // Load the cluster configuration and get a handle to HDFS
            Configuration conf = new Configuration();
            FileSystem hdfs = FileSystem.get(conf);

            Path srcPath = new Path(args[0]);
            Path dstPath = new Path(args[1]);

            // Copy the local file/directory to HDFS,
            // like "hadoop fs -put <src> <dst>"
            hdfs.copyFromLocalFile(srcPath, dstPath);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Step 2: Compile ex02.java
$ javac ex02.java
Step 3: Run the program.
Run the program on Hadoop.
$ bin/hadoop ex02 conf/ input2/

Then you can check the contents of input2 on HDFS.
$ bin/hadoop fs -ls input2
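
The reverse direction works the same way: copyToLocalFile is the API counterpart of the -copyToLocal shell command from Part I. A minimal sketch (the class name ex02down is hypothetical and not part of this lab):

import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.*;

public class ex02down {
    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.print("Need two arguments\n");
            return;
        }
        try {
            Configuration conf = new Configuration();
            FileSystem hdfs = FileSystem.get(conf);

            // Copy an HDFS path to the local file system,
            // like "hadoop fs -copyToLocal <src> <dst>"
            hdfs.copyToLocalFile(new Path(args[0]), new Path(args[1]));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}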
