
Exp no: WORD COUNT MAP REDUCE PROGRAM TO UNDERSTAND MAP REDUCE PARADIGM


Date:

AIM:

To run a basic Word Count MapReduce program and understand the MapReduce paradigm.

PREREQUISITES:

• Java Installation:
• Ensure Java Development Kit (JDK) is installed on all nodes of your Hadoop cluster.
• Set the JAVA_HOME environment variable to point to your JDK installation
directory.
• Hadoop Installation:
• Install Apache Hadoop on your cluster. Ensure Hadoop is properly configured and all
nodes are accessible.
• Hadoop HDFS should be up and running, and you should have basic knowledge of
configuring Hadoop properties (core-site.xml, hdfs-site.xml, mapred-site.xml, etc.).
• Development Environment:
• Set up a development environment with Hadoop installed locally if you're testing on a
single-node setup (pseudo-distributed mode).

SOURCE CODE :

import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class wordCount {

  // Mapper: for every input line, emit (word, 1) for each whitespace-separated token.
  public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sum the counts emitted for each word and write (word, total).
  public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "wordcount");
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
  }
}
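The driver above uses the older new Job(conf, ...) constructor, which still works but is deprecated in Hadoop 2.x. Below is a minimal alternative driver, offered only as a sketch: it uses the non-deprecated Job.getInstance API, sets the job jar via setJarByClass, and registers the Reduce class as an optional combiner to cut shuffle traffic. The class name WordCountDriver is illustrative, and it assumes the wordCount class above is compiled in the same package.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "wordcount");
    job.setJarByClass(WordCountDriver.class);      // ship the jar containing this class to the cluster
    job.setMapperClass(wordCount.Map.class);       // Mapper defined in the listing above
    job.setCombinerClass(wordCount.Reduce.class);  // optional: pre-aggregate counts on the map side
    job.setReducerClass(wordCount.Reduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}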
OUTPUT:

RESULT:
Exp no: IMPLEMENTING MATRIX MULTIPLICATION WITH HADOOP MAP REDUCE
Date:

AIM:

To implement Matrix Multiplication with Hadoop MapReduce.

MAPPING:

CO5: Use Hadoop-related tools such as HBase, Cassandra, Pig, and Hive for big data analytics.

PREREQUISITES :

Java Installation:
• Ensure Java Development Kit (JDK) is installed on all nodes of your Hadoop cluster.
• Set the JAVA_HOME environment variable to point to your JDK installation
directory.
Hadoop Installation:
• Install Apache Hadoop on your cluster. Ensure Hadoop is properly configured and all
nodes are accessible.
• Hadoop HDFS should be up and running, and you should have basic knowledge of
configuring Hadoop properties (core-site.xml, hdfs-site.xml, mapred-site.xml, etc.).
Development Environment:
• Set up a development environment with Hadoop installed locally if you're testing on a
single-node setup (pseudo-distributed mode).

ALGORITHM FOR MAP FUNCTION:

a. For each element mij of M, produce the (key, value) pairs ((i,k), (M, j, mij)) for k = 1, 2, 3, ... up to the number of columns of N.
b. For each element njk of N, produce the (key, value) pairs ((i,k), (N, j, njk)) for i = 1, 2, 3, ... up to the number of rows of M.
c. Return the set of (key, value) pairs, so that each key (i,k) has a list with values (M, j, mij) and (N, j, njk) for all possible values of j.

ALGORITHM FOR REDUCE FUNCTION:

d. For each key (i,k):
e. Sort the values that begin with M by j into listM, and sort the values that begin with N by j into listN; multiply mij and njk for the jth value of each list.
f. Sum up the products and return ((i,k), Σj mij × njk).
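A small worked example may make the algorithm concrete (the numbers are assumed purely for illustration). Let M = [[1, 2], [3, 4]] and N = [[5, 6], [7, 8]]. For the output cell (1,1), the map phase emits the values (M, 1, 1) and (M, 2, 2) from the first row of M, and (N, 1, 5) and (N, 2, 7) from the first column of N, all under the key (1,1). The reduce phase pairs these values by j and sums the products: 1×5 + 2×7 = 19, so it emits ((1,1), 19), which is indeed the (1,1) entry of M×N.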
STEP 1: Download the Hadoop jar files with these links.
Download Hadoop Common jar files: https://goo.gl/G4MyHp
$ wget https://goo.gl/G4MyHp -O hadoop-common-2.2.0.jar
Download Hadoop MapReduce jar file: https://goo.gl/KT8yfB
$ wget https://goo.gl/KT8yfB -O hadoop-mapreduce-client-core-2.7.1.jar

CODING:

// The package must match the fully-qualified class name (edu.uta.cse6331.Multiply) used on the hadoop jar command line below.
package edu.uta.cse6331;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.*;
import org.apache.hadoop.mapreduce.lib.output.*;
import org.apache.hadoop.util.ReflectionUtils;
class Element implements Writable {
int tag;
int index;
double value;
Element() {
tag = 0;
index = 0;
value = 0.0;
}
Element(int tag, int index, double value) {
this.tag = tag;
this.index = index;
this.value = value;
}
@Override
public void readFields(DataInput input) throws IOException {
tag = input.readInt();
index = input.readInt();
value = input.readDouble();
}
@Override
public void write(DataOutput output) throws IOException {
output.writeInt(tag);
output.writeInt(index);
output.writeDouble(value);
}
}
class Pair implements WritableComparable<Pair> {
int i;
int j;
Pair() {
i = 0;
j = 0;
}
Pair(int i, int j) {
this.i = i;
this.j = j;
}
@Override
public void readFields(DataInput input) throws IOException {
i = input.readInt();
j = input.readInt();
}
@Override
public void write(DataOutput output) throws IOException {
output.writeInt(i);
output.writeInt(j);
}
@Override
public int compareTo(Pair compare) {
if (i > compare.i) {
return 1;
} else if ( i < compare.i) {
return -1;
} else {
if(j > compare.j) {
return 1;
} else if (j < compare.j) {
return -1;
}
}
return 0;
}
public String toString() {
// i and j are separated by a space so that the second job can re-parse the intermediate output
return i + " " + j;
}
@Override
public int hashCode() {
// equal (i,j) keys must hash to the same reducer partition
return 31 * i + j;
}
}
public class Multiply
{
public static class MatriceMapperM extends Mapper<Object,Text,IntWritable,Element> {
@Override
public void map(Object key, Text value, Context context)
throws IOException, InterruptedException {
String readLine = value.toString();
String[] stringTokens = readLine.split(",");
int index = Integer.parseInt(stringTokens[0]);
double elementValue = Double.parseDouble(stringTokens[2]);
Element e = new Element(0, index, elementValue);
IntWritable keyValue = new IntWritable(Integer.parseInt(stringTokens[1]));
context.write(keyValue, e);
}
}
public static class MatriceMapperN extends Mapper<Object,Text,IntWritable,Element> {
@Override
public void map(Object key, Text value, Context context)
throws IOException, InterruptedException {
String readLine = value.toString();
String[] stringTokens = readLine.split(",");
int index = Integer.parseInt(stringTokens[1]);
double elementValue = Double.parseDouble(stringTokens[2]);
Element e = new Element(1,index, elementValue);
IntWritable keyValue = new IntWritable(Integer.parseInt(stringTokens[0]));
context.write(keyValue, e);
}
}
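// NOTE: main() below references ReducerMxN, MapMxN and ReduceMxN, but their definitions are
// missing from this listing. The classes that follow are a hedged reconstruction (a sketch,
// not the original source) of what those stages would have to look like for the two-job
// pipeline to compile and run: ReducerMxN joins M and N elements that share the index j and
// emits one partial product per output cell (i,k); MapMxN re-parses the intermediate text
// output of the first job; ReduceMxN sums the partial products for each (i,k).
public static class ReducerMxN extends Reducer<IntWritable, Element, Pair, DoubleWritable> {
@Override
public void reduce(IntWritable key, Iterable<Element> values, Context context)
throws IOException, InterruptedException {
ArrayList<Element> M = new ArrayList<Element>();
ArrayList<Element> N = new ArrayList<Element>();
Configuration conf = context.getConfiguration();
for (Element element : values) {
// Hadoop reuses the value object, so take a deep copy before storing it in a list.
Element tempElement = ReflectionUtils.newInstance(Element.class, conf);
ReflectionUtils.copy(conf, element, tempElement);
if (tempElement.tag == 0) {
M.add(tempElement);
} else if (tempElement.tag == 1) {
N.add(tempElement);
}
}
// Every mij paired with every njk for this j contributes one partial product of cell (i,k).
for (int m = 0; m < M.size(); m++) {
for (int n = 0; n < N.size(); n++) {
Pair p = new Pair(M.get(m).index, N.get(n).index);
double product = M.get(m).value * N.get(n).value;
context.write(p, new DoubleWritable(product));
}
}
}
}
public static class MapMxN extends Mapper<Object, Text, Pair, DoubleWritable> {
@Override
public void map(Object key, Text value, Context context)
throws IOException, InterruptedException {
// Intermediate lines look like "i j<TAB>partialProduct" (see Pair.toString()).
String[] pairValue = value.toString().split("\\s+");
Pair p = new Pair(Integer.parseInt(pairValue[0]), Integer.parseInt(pairValue[1]));
DoubleWritable val = new DoubleWritable(Double.parseDouble(pairValue[2]));
context.write(p, val);
}
}
public static class ReduceMxN extends Reducer<Pair, DoubleWritable, Pair, DoubleWritable> {
@Override
public void reduce(Pair key, Iterable<DoubleWritable> values, Context context)
throws IOException, InterruptedException {
double sum = 0.0;
for (DoubleWritable value : values) {
sum += value.get();
}
context.write(key, new DoubleWritable(sum));
}
}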
public static void main(String[] args) throws Exception {
Job job = Job.getInstance();
job.setJobName("MapIntermediate");
job.setJarByClass(Multiply.class);
MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, MatriceMapperM.class);
MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, MatriceMapperN.class);
job.setReducerClass(ReducerMxN.class);
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(Element.class);
job.setOutputKeyClass(Pair.class);
job.setOutputValueClass(DoubleWritable.class);
job.setOutputFormatClass(TextOutputFormat.class);
FileOutputFormat.setOutputPath(job, new Path(args[2]));
job.waitForCompletion(true);
Job job2 = Job.getInstance();
job2.setJobName("MapFinalOutput");
job2.setJarByClass(Multiply.class);
job2.setMapperClass(MapMxN.class);
job2.setReducerClass(ReduceMxN.class);
job2.setMapOutputKeyClass(Pair.class);
job2.setMapOutputValueClass(DoubleWritable.class);
job2.setOutputKeyClass(Pair.class);
job2.setOutputValueClass(DoubleWritable.class);
job2.setInputFormatClass(TextInputFormat.class);
job2.setOutputFormatClass(TextOutputFormat.class);
FileInputFormat.setInputPaths(job2, new Path(args[2]));
FileOutputFormat.setOutputPath(job2, new Path(args[3]));
job2.waitForCompletion(true);
}
}
#!/bin/bash
rm -rf multiply.jar classes
module load hadoop/2.6.0
mkdir -p classes
javac -d classes -cp classes:`$HADOOP_HOME/bin/hadoop classpath` Multiply.java
jar cf multiply.jar -C classes .
echo "end"
export HADOOP_CONF_DIR=/home/$USER/cometcluster
module load hadoop/2.6.0
myhadoop-configure.sh
start-dfs.sh
start-yarn.sh
hdfs dfs -mkdir -p /user/$USER
hdfs dfs -put M-matrix-large.txt /user/$USER/M-matrix-large.txt
hdfs dfs -put N-matrix-large.txt /user/$USER/N-matrix-large.txt
hadoop jar multiply.jar edu.uta.cse6331.Multiply /user/$USER/M-matrix-large.txt /user/$USER/N-matrix-large.txt /user/$USER/intermediate /user/$USER/output
rm -rf output-distr
mkdir output-distr
hdfs dfs -get /user/$USER/output/part* output-distr
stop-yarn.sh
stop-dfs.sh
myhadoop-cleanup.sh
OUTPUT:

module load hadoop/2.6.0


rm -rf output intermediate
hadoop --config $HOME jar multiply.jar edu.uta.cse6331.Multiply M-matrix-small.txt N-matrix-small.txt intermediate output

RESULT:
Exp no:

HBASE PRACTICE EXAMPLES


Date:

AIM:
To install HBase and practise basic HBase shell commands.

PROCEDURE :

Installing Apache HBase involves several steps to ensure proper setup and configuration. Here's a
general procedure for installing HBase:
Prerequisites
• Java Installation:
• Ensure Java Development Kit (JDK) is installed. HBase requires Java 8 or later
versions.
• Set the JAVA_HOME environment variable to point to your JDK installation
directory.
• Hadoop Installation (Optional):
• HBase typically runs on top of Hadoop HDFS. If you haven't installed Hadoop
separately, you can use HBase's standalone mode for development purposes.

COMMANDS

1.Start hbase shell:


Open your terminal or command prompt and run the following command to start the
HBase shell:
Syntax: hbase shell
2.Create a new table:

To create a new table, use the create command followed by the table
name and the names of the column families you want in the table. Here's the
basic syntax:
Syntax: create 'table_name', 'column_family1', 'column_family2', ...

3.Verify table creation:

To verify that the table has been created successfully, you can use the list
command to list all the tables in HBase.
Syntax: list

4.Insert the rows into tables


To insert data into an HBase table, you can use the put command in the HBase shell.
Here's the basic syntax for inserting data
Syntax:
put 'table_name', 'row_key', 'column_family:column_qualifier', 'value'

5.Describe the table


This command provides information about the structure of the 'students' table,
including its column families.
Syntax: describe 'table_name'

6.Disable the table


This command disables (pauses) the 'students' table, preventing any read or write
operations on it.
Syntax: disable 'table_name'

7.Enable the table


This command re-enables the 'students' table, allowing read and write operations again.
Syntax: enable 'table_name'

8.Alter the table


This command adds a new column family named 'contact_details' to the 'students' table.
Syntax: alter 'table_name', {NAME => 'new_column_family'}

9.Count rows in table


This command counts and displays the total number of rows in the 'students' table.
Syntax: count 'table_name'

10.Check if the table exists


This command checks if a table named 'employees' exists. It will return true if the
table exists or false if it doesn't
Syntax: exists 'table_name'

11.View the table


This command scans and retrieves all data in the 'students' table.

Syntax: scan 'table_name'


12.Drop the table
This command permanently deletes the 'employees' table and all its data. Use it with caution. A table must be disabled before it can be dropped.

Syntax: drop 'table_name'

13.Exit hbase shell:

This command exits the HBase shell and returns you to your system's command prompt.
Syntax: exit
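The shell commands above can also be issued programmatically. The sketch below is an illustration only, not part of the original exercise: it assumes the HBase 2.x Java client libraries are on the classpath, and the table name 'students', column family 'personal', row key and cell contents are made up for the example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseShellEquivalent {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();       // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            TableName students = TableName.valueOf("students"); // hypothetical table name
            if (!admin.tableExists(students)) {                  // shell: exists 'students'
                admin.createTable(TableDescriptorBuilder.newBuilder(students)   // shell: create 'students','personal'
                        .setColumnFamily(ColumnFamilyDescriptorBuilder.of("personal"))
                        .build());
            }
            try (Table table = connection.getTable(students)) {
                Put put = new Put(Bytes.toBytes("row1"));        // shell: put 'students','row1','personal:name','Asha'
                put.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("name"), Bytes.toBytes("Asha"));
                table.put(put);
                try (ResultScanner scanner = table.getScanner(new Scan())) {    // shell: scan 'students'
                    for (Result result : scanner) {
                        System.out.println(result);
                    }
                }
            }
        }
    }
}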
OUTPUT:
RESULT:
Exp no:

INSTALLATION OF HIVE WITH EXAMPLES


Date:

AIM:
To install Cloudera and Oracle VM VirtualBox and run Hive shell commands in the terminal.

PROCEDURE:
PRE-REQUISITES:
Cloudera

Oracle VM VirtualBox

INSTALLATION PROCEDURE OF CLOUDERA:


• Download Oracle VM Virtual Box

• Next, we need to download the Cloudera Quickstart VM:
https://drive.google.com/drive/folders/1vLg6XUSjcC3Jl78SWodVGGxbgX0RO0_N?usp=sharing
Version: 5.13. File size is 5.5 GB.

• Configuring Cloudera on Virtual Machine


• Once you have downloaded the Cloudera Quickstart VM from the above link, you will see two files like below.

• Open the Oracle VirtualBox

• Click on File, Import Appliance


Browse to the location of the file and select the file ending with the Open Virtualization Format (.ovf) extension.

• Click on Next once finished.

• This will be the default configuration for this Virtual Machine. You can also change the configuration and allocate resources according to your needs. It is best to provide a minimum of 4 GB of RAM for this Virtual Machine.

• Click on Import, it will take a few seconds.

• After that you will be able to see cloudera-quickstart-vm in the Oracle VM VirtualBox Manager. Right-click on it and select Settings. Go to Network and change "Attached to" from NAT to Host-only Adapter. Click OK to apply the settings.
• Click on Start to start the cloudera environment on your Virtual Machine.

• It will take a few minutes to load the Cloudera Environment on VM.


• Once it has finished loading, you will be able to see this screen. That means you have successfully installed Cloudera on the Virtual Machine.

Under the hood, everything is pre-configured so that you don’t need to configure it by
yourself.
Click on Terminal to see hadoop version, hive, oozie, pig, spark-shell, HBase and
many more.

Connect Cloudera VM from your Local System


• This is made possible by PuTTY (https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html) for Windows users.

• First, we need to know the IP address/host name of this Virtual Machine. Open the Cloudera terminal and type 'ifconfig'.
ALGORITHM:

Step 1: Create a Database (if not exists)


Database name (`userdb`).

Step 2: Create a Table (if not exists)


Table name (`employee`), columns (`eid`, `name`, `salary`, `designation`), delimiters (`'\t'`,
`'\n'`), and storage location (`'/user/input'`).

Step 3: Load Data into the Table


Path to the local data file (`inputdata.txt`).

Step 4: Create a View


Input: View name (`writer_editor`), condition (`designation='Writer' OR
designation='Editor'`).

Step 5: Create an Index (with Deferred Rebuild)


Index name (`index_salary`), indexed column (`salary`), index handler
(`'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'`).

Step 6: Query Data


Retrieve and display all records from the `employee` table.
Retrieve and display records from the `writer_editor` view.
PROGRAM:

CREATE DATABASE IF NOT EXISTS userdb;

CREATE TABLE IF NOT EXISTS employee ( eid int, name String, salary String, designation String)
COMMENT 'Employee details' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '/user/input';

LOAD DATA LOCAL INPATH 'inputdata.txt' OVERWRITE INTO TABLE employee;

CREATE VIEW writer_editor AS SELECT * FROM employee WHERE designation='Writer' OR designation='Editor';

CREATE INDEX index_salary ON TABLE employee(salary) AS
'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD;

SELECT * from employee;

SELECT * from writer_editor;
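The statements above are intended to be run in the Hive shell on the Cloudera VM. As a side note, the same statements can also be executed from Java through HiveServer2's JDBC interface. The sketch below is an illustration only: the host, port and user are assumptions about your environment, and the hive-jdbc driver must be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Assumed HiveServer2 host/port and user; adjust to your VM.
        try (Connection con = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/userdb", "cloudera", "");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM employee")) {
            while (rs.next()) {
                // Columns by position: eid, name (see the CREATE TABLE statement above).
                System.out.println(rs.getInt(1) + "\t" + rs.getString(2));
            }
        }
    }
}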


OUTPUT:

RESULT:

Ex No: 06 INSTALLATION OF HBASE, INSTALLING THRIFT ALONG WITH PRACTICE EXAMPLES
Date:

AIM:
To install HBase on Windows, install Thrift, and work through practice examples.

PROCEDURE:

Step-1: (Extraction of files)


Extract all the files in C drive
Step-2: (Creating Folder)
Create folders named "hbase" and "zookeeper."

Step-3: (Deleting line in HBase.cmd)


Open hbase.cmd in any text editor.
Search for line %HEAP_SETTINGS% and remove it.

Step-4: (Add lines in hbase-env.cmd)


Now open hbase-env.cmd, which is in the conf folder in any text editor.
set JAVA_HOME=%JAVA_HOME%
set HBASE_CLASSPATH=%HBASE_HOME%\lib\client-facing-thirdparty\*
set HBASE_HEAPSIZE=8000
set HBASE_OPTS="-XX:+UseConcMarkSweepGC" "-Djava.net.preferIPv4Stack=true"
set SERVER_GC_OPTS="-verbose:gc" "-XX:+PrintGCDetails" "-XX:+PrintGCDateStamps" %HBASE_GC_OPTS%
set HBASE_USE_GC_LOGFILE=true
set HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false" "-Dcom.sun.management.jmxremote.authenticate=false"
set HBASE_MASTER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10101"
set HBASE_REGIONSERVER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10102"
set HBASE_THRIFT_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10103"
set HBASE_ZOOKEEPER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10104"
set HBASE_REGIONSERVERS=%HBASE_HOME%\conf\regionservers
set HBASE_LOG_DIR=%HBASE_HOME%\logs
set HBASE_IDENT_STRING=%USERNAME%
set HBASE_MANAGES_ZK=true
Step-6: (Setting Environment Variables)
Now set up the environment variables.
Search "System environment variables."

Now click on " Environment Variables."


Then click on "New."

Variable name: HBASE_HOME


Variable Value: Put the path of the Hbase folder.
We have completed the HBase Setup on Windows procedure.
Step 7: Install Apache Thrift

Download Thrift:
Visit the Apache Thrift website: https://thrift.apache.org/download.
Download and extract Thrift.
Build and Install Thrift:
./configure
make
sudo make install

Step 8: Practice Examples (Using Java with HBase and Thrift)


Below is a simple Java example demonstrating how to use Apache Thrift to interact with HBase:
import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import org.apache.hadoop.hbase.thrift.generated.Hbase;

public class HBaseThriftExample {

public static void main(String[] args) {


TTransport transport = new TSocket("localhost", 9090);
try {
transport.open();

// Create Thrift client


Hbase.Client client = new Hbase.Client(new TBinaryProtocol(transport));

// Perform operations
// ... add your HBase Thrift operations here ...
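// One concrete operation, shown only as an illustration (it is not part of the original
// record): list the tables currently known to HBase. getTableNames() is part of the
// Thrift-generated Hbase.Client interface.
for (java.nio.ByteBuffer name : client.getTableNames()) {
System.out.println(java.nio.charset.StandardCharsets.UTF_8.decode(name.duplicate()));
}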

// Close the transport


transport.close();
} catch (TException e) {
e.printStackTrace();
}
}
}

Ensure that your HBase Thrift server is running and accessible at the specified host and port. Also,
make sure the necessary HBase Thrift libraries are included in your Java project's classpath.

The provided Java code connects to an HBase Thrift server, performs whatever operations are placed inside the try block (the table-listing loop above is only an illustration), and handles exceptions. The output therefore depends on the operations you perform within the try block.

If everything runs successfully (meaning the HBase Thrift server is running and reachable, and your operations execute without errors), the illustrative version above simply prints the names of the existing tables; with no operations added, the program terminates without any output.

RESULT:
Ex.No: 07 PRACTICE IMPORTING AND EXPORTING DATA FROM VARIOUS DATABASES
Date:

AIM:
To perform importing and exporting of data with various systems such as HDFS, Apache Hive and Apache Spark.

PROCEDURE:

Importing Data:

1. Hadoop Distributed File System (HDFS):

• Use the Hadoop hdfs dfs command-line tool or the Hadoop FileSystem API to copy data from a local file system or another location to HDFS (a Java sketch using the FileSystem API appears after this list). For example:

$ hdfs dfs -put local_file.txt /hdfs/path

• This command uploads local_file.txt from the local file system to the HDFS path /hdfs/path.

2. Apache Hive:

• Hive supports data import from various sources, including local files, HDFS, and databases.
You can use the LOAD DATA statement to import data into Hive tables. For example:

LOAD DATA INPATH '/hdfs/path/data.txt' INTO TABLE my_table;

• This statement loads data from the HDFS path /hdfs/path/data.txt into the Hive table my_table.

3. Apache Spark:

• Spark provides rich APIs for data ingestion. You can use the DataFrameReader API obtained from SparkSession to read data from different sources such as CSV files, databases, or streaming systems. For example:

val df = spark.read.format("csv").load("/path/to/data.csv")

• This code reads data from the CSV file located at /path/to/data.csv into a DataFrame in Spark.
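Item 1 above also mentions the Hadoop FileSystem API. The Java sketch below performs the same upload programmatically; it is an illustration only, and the paths simply mirror the hypothetical ones used in the command-line example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPutExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);        // connect to the configured default file system (HDFS)
        // Equivalent of: hdfs dfs -put local_file.txt /hdfs/path
        fs.copyFromLocalFile(new Path("local_file.txt"), new Path("/hdfs/path/local_file.txt"));
        fs.close();
    }
}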
Exporting Data:

1. Hadoop Distributed File System (HDFS):

• Use the Hadoop hdfs dfs command-line tool or the Hadoop FileSystem API to copy data from HDFS to a local file system or another location. For example:

$ hdfs dfs -get /hdfs/path/file.txt local_file.txt

• This command downloads the file /hdfs/path/file.txt from HDFS and saves it as local_file.txt in the local file system.

2. Apache Hive:

• Exporting data from Hive can be done in various ways, depending on the desired output format. You can use the INSERT OVERWRITE statement to export data from Hive tables to files or other Hive tables. For example:

INSERT OVERWRITE LOCAL DIRECTORY '/path/to/output' SELECT * FROM my_table;

• This statement exports the data from the my_table Hive table to the local directory /path/to/output.

3. Apache Spark:

• Spark provides flexible options for data export. You can use the DataFrameWriter API to write data to different file formats, databases, or streaming systems. For example:

df.write.format("parquet").save("/path/to/output")

• This code saves the DataFrame df in Parquet format to the specified output directory.

RESULT:
