Hbase PDF

The document describes how to: 1. Create an HBase table and load sample data. 2. Create an external Hive table that maps to an existing HBase table to sync data between the two. 3. Load data from HDFS into HBase and sync it with the external Hive table. 4. Make updates to data in HBase and see the changes sync to Hive.

Uploaded by

chandra reddy

############ HBASE ###################

Note: for clearing screen => system("clear")

Note: commands are case sensitive

Step1: Create a table named emp in Hbase with one column family, details

create 'emp','details'

Step2: insert data in emp table

put 'emp','Row1','details:ename','King'

put 'emp','Row1','details:esal','24000'

put 'emp','Row1','details:loc','Hyd'

put 'emp','Row2','details:ename','Steven'

put 'emp','Row2','details:esal','14000'

put 'emp','Row2','details:loc','Delhi'
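
The put commands above all follow one pattern (table, row key, column, value), so for larger data sets they can be generated instead of typed. A minimal shell sketch, assuming the records arrive as rowkey,ename,esal,loc lines; the helper name gen_emp_puts is made up for this example and is not part of Hbase:

```shell
#!/bin/sh
# Sketch: generate Hbase shell 'put' commands for the emp table.
# gen_emp_puts is an illustrative helper, not an Hbase tool.

# Reads "rowkey,ename,esal,loc" records on stdin and prints one put per column.
# Values containing a single quote or a comma are not handled here.
gen_emp_puts() {
  while IFS=, read -r row ename esal loc; do
    printf "put 'emp','%s','details:ename','%s'\n" "$row" "$ename"
    printf "put 'emp','%s','details:esal','%s'\n" "$row" "$esal"
    printf "put 'emp','%s','details:loc','%s'\n" "$row" "$loc"
  done
}

# The two sample rows used in Step2.
printf '%s\n' 'Row1,King,24000,Hyd' 'Row2,Steven,14000,Delhi' | gen_emp_puts
```

The output is plain Hbase shell input, so it can be pasted into the shell or saved to a file and passed to hbase shell as a script.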

Step3: display the data from emp table

scan 'emp'

Step4: display only second row from emp table

get 'emp','Row2'

Step5: Update Row1 location to Chennai

put 'emp','Row1','details:loc','Chennai'

get 'emp','Row1'

Step6: Describe the table emp and notice that the table status is ENABLED

describe 'emp'

Step7: Disable the table emp; a table must be disabled before it can be dropped

disable 'emp'

Step8: Drop the table emp

drop 'emp'

Step9: Display the list of tables in hbase

list

************ HDFS FILE TO Hbase ************

Step1: Create a table in Hbase with the column families needed for the Empexport.csv file data

create 'employees','name','contact','job','wages','work'

list

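Step2: Load the Empexport.csv file from the desktop to Hdfs

Note: the import command in Step3 reads meg_dir1/Empexport.csv, so the file is placed in the Hdfs directory meg_dir1

hadoop fs -mkdir meg_dir1

hadoop fs -put Desktop/Empexport.csv meg_dir1

hadoop fs -ls meg_dir1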

Step3: Map the columns from Empexport.csv to the column families of the hbase employees table by running importtsv with the jars hbase.jar & guava-11.0.2.jar

Note: make sure you type the below statement in one line only

hadoop jar /usr/lib/hbase/hbase.jar importtsv -libjars /usr/lib/hbase/lib/guava-11.0.2.jar '-Dimporttsv.separator=,' -Dimporttsv.columns=HBASE_ROW_KEY,name:fname,name:lname,contact:email,contact:phone,job:hire_date,job:job_id,wages:salary,wages:commission_pct,work:manager_id,work:department_id employees meg_dir1/Empexport.csv
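
Because the import statement has to be typed on one line and the column list is long, one option is to assemble it from parts in a small script. The following is only a sketch: the jar paths, table name, and input path are the ones used in this section, while the helper names to_comma_list and build_importtsv_cmd are made up for the example. It prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Sketch: build the importtsv command from parts so it always ends up on one line.
# to_comma_list and build_importtsv_cmd are illustrative helpers, not Hbase tools.

# Column mapping in CSV field order; HBASE_ROW_KEY marks the field used as row key.
EMP_COLUMNS="HBASE_ROW_KEY
name:fname name:lname
contact:email contact:phone
job:hire_date job:job_id
wages:salary wages:commission_pct
work:manager_id work:department_id"

# Turn a whitespace-separated list into a comma-separated one.
to_comma_list() {
  # $1 is left unquoted on purpose so spaces and newlines both split words.
  printf '%s\n' $1 | paste -s -d , -
}

# Print the full command for a table ($1), an Hdfs input path ($2)
# and a whitespace-separated column list ($3).
build_importtsv_cmd() {
  printf '%s %s %s %s %s %s\n' \
    "hadoop jar /usr/lib/hbase/hbase.jar importtsv" \
    "-libjars /usr/lib/hbase/lib/guava-11.0.2.jar" \
    "'-Dimporttsv.separator=,'" \
    "-Dimporttsv.columns=$(to_comma_list "$3")" \
    "$1" "$2"
}

build_importtsv_cmd employees meg_dir1/Empexport.csv "$EMP_COLUMNS"
```

Keeping the column list one family per line makes it easier to compare against the create statement in Step1 than a single long string.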
********************************************************

Creating a table in Hive that automatically creates the table in Hbase. The two tables stay in sync; in this walkthrough, insert, update and delete are all done on this table from the Hbase side

********************************************************

Step1: Connect to Hive

Step2: Create table customer in Hive database xora_db

create table customer(key string, country string, state string)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,add:country,add:state')
TBLPROPERTIES ('hbase.table.name' = 'customer');

Step3: Display the data from the customer table (it is empty at this point)

select * from customer;

Step4: Connect to Hbase; you will see that the customer table was created automatically, because creating an Internal/Managed table in Hive with the Hbase storage handler also creates the table in Hbase

list

scan 'customer'

Step5: Add a row in Hbase customer table

put 'customer','901','add:country','Germany'

put 'customer','901','add:state','Stuttgart'

scan 'customer'

Step6: Connect to Hive and see that the data for row 901 added in Hbase has synced to the Hive table customer

select * from customer;

Step7: Load the file custH.csv from Hdfs into the Hbase table customer and see it sync to Hive

hadoop fs -put Desktop/custH.csv input_dir1

hadoop jar /usr/lib/hbase/hbase.jar importtsv -libjars /usr/lib/hbase/lib/guava-11.0.2.jar '-Dimporttsv.separator=,' -Dimporttsv.columns=HBASE_ROW_KEY,add:country,add:state customer input_dir1/custH.csv
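
importtsv splits each line on the separator and matches the fields to importtsv.columns by position, so a line with the wrong number of fields is a common reason for rows being reported as bad lines or loaded incorrectly. A minimal sketch of a pre-upload check on the local file follows. The helper name check_field_count and the sample rows are assumptions made for the example, since the contents of custH.csv are not shown here. Like the plain comma split it mimics, the check does not understand quoted commas.

```shell
#!/bin/sh
# Sketch: verify a local CSV against an importtsv column mapping before uploading it.
# check_field_count is an illustrative helper, not part of Hbase or Hadoop.

# Succeeds when every non-empty line has as many comma-separated fields
# as the mapping has entries; prints the offending lines otherwise.
check_field_count() {
  csv_file=$1
  mapping=$2
  awk -F, -v mapping="$mapping" '
    BEGIN { want = split(mapping, parts, ",") }
    NF > 0 && NF != want {
      printf "line %d: %d fields, expected %d\n", NR, NF, want
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  ' "$csv_file"
}

# Hypothetical sample rows standing in for custH.csv.
sample=$(mktemp)
cat > "$sample" <<'EOF'
101,India,Maharashtra
102,USA,Texas
EOF

if check_field_count "$sample" "HBASE_ROW_KEY,add:country,add:state"; then
  echo "field counts match the mapping"
fi
rm -f "$sample"
```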

Step8: In Hbase terminal

scan 'customer'

Step9: In Hive terminal

select * from customer;

Step10: Delete the cell value of column country for row 901 from the Hbase customer table

delete 'customer','901','add:country'

get 'customer','901'

Step11: On Hive terminal check the delete effect

select * from customer;


Step12: Update the cell value of column state for row 101 in the Hbase customer table

put 'customer','101','add:state','MH'

get 'customer','101'

Step13: On Hive terminal check the update effect

select * from customer;

Step14: Drop the customer table from the Hive terminal; it will also be dropped from Hbase

Drop table customer;

Step15: Check whether the table still exists in Hbase; it won't be there, as it was dropped automatically when it was dropped from Hive

list

************ Syncing Hive External tables with a pre-existing table in Hbase ************

Step1: Create table employees in Hbase and load it with data from a file in HDFS

Refer to the "HDFS FILE TO Hbase" section above

Step2: On Hbase terminal

list

scan 'employees'

Step3: Connect to Hive and create a hbase_demo database

create database hbase_demo;

show databases;

use hbase_demo;

Step4: In Hive create an External table emp that points at the Hbase employees table

create external table emp(empid int, fname string, lname string, email string, phone string, hire_date string, job_id string, salary float, commission_pct float, manager_id int, department_id int)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,name:fname,name:lname,contact:email,contact:phone,job:hire_date,job:job_id,wages:salary,wages:commission_pct,work:manager_id,work:department_id')
TBLPROPERTIES ('hbase.table.name' = 'employees');

desc formatted emp;
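
Hive pairs the table's columns with the entries of hbase.columns.mapping by position, so the two lists need the same length and order. With eleven columns this is easy to get wrong, and a quick side-by-side listing can help before running the create statement. The sketch below uses the column names and mapping from Step4; the helper name show_mapping is made up for the example.

```shell
#!/bin/sh
# Sketch: pair each Hive column with its entry in hbase.columns.mapping.
# show_mapping is an illustrative helper, not a Hive or Hbase command.

HIVE_COLUMNS="empid,fname,lname,email,phone,hire_date,job_id,salary,commission_pct,manager_id,department_id"
HBASE_MAPPING=":key,name:fname,name:lname,contact:email,contact:phone,job:hire_date,job:job_id,wages:salary,wages:commission_pct,work:manager_id,work:department_id"

# Prints "hive_column -> hbase_column" pairs; fails if the two lists differ in length.
show_mapping() {
  awk -v cols="$1" -v mapping="$2" 'BEGIN {
    n = split(cols, c, ",")
    m = split(mapping, h, ",")
    if (n != m) {
      printf "%d Hive columns but %d mapping entries\n", n, m
      exit 1
    }
    for (i = 1; i <= n; i++) printf "%s -> %s\n", c[i], h[i]
  }'
}

show_mapping "$HIVE_COLUMNS" "$HBASE_MAPPING"
```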


Step5: Display the data from the emp Hive table; it will show the data from Hbase

select * from emp;

Step6: Update job_id of employee 100 in Hbase table

put 'employees','100','job:job_id','President'

get 'employees','100'

Step7: Drop the emp table in Hive; the table will only be dropped from Hive, and the Hbase employees table will be unaffected

Drop table emp;

Step8: If we drop the employees table in Hbase instead, it is dropped and the Hive emp table is not affected, but a select from the emp Hive table throws an error because it is unable to find the employees table in Hbase
