HBASE

HBase is a distributed, column-oriented database built on Hadoop, providing quick random access to large amounts of structured data. It is schema-less, horizontally scalable, and integrates with Hadoop, supporting real-time read/write operations. HBase is utilized in applications requiring fast data access and is used by major companies like Facebook and Twitter.

Uploaded by

Kavvya Mridul

CONTENT
◦ HBase: HBasics
◦ Concepts
◦ Clients
◦ Example
◦ HBase Versus RDBMS
◦ Big SQL: Introduction
Limitations of Hadoop

◦ Hadoop performs only batch processing, and data is accessed only in a
sequential manner. Even the simplest job therefore has to scan the entire dataset.
◦ Processing a huge dataset typically produces another huge dataset, which must also
be processed sequentially. A different solution is needed to reach any single piece
of data directly, without scanning the rest (random access).
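The difference between the two access patterns can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record list, the key format, and both function names are hypothetical, and an in-memory dict stands in for a keyed store.

```python
# Hypothetical dataset: (key, value) records, read the way a batch job would.
records = [(f"user{i:05d}", i * 2) for i in range(100_000)]


def sequential_lookup(records, key):
    """Scan records in order until the key is found (batch-style access)."""
    steps = 0
    for k, v in records:
        steps += 1
        if k == key:
            return v, steps
    return None, steps


# An index keyed by record key gives direct (random) access.
index = dict(records)


def random_lookup(index, key):
    """Fetch one record by key without touching the others."""
    return index.get(key)


value, steps = sequential_lookup(records, "user99999")
print(value, steps)                       # the last record costs a full scan
print(random_lookup(index, "user99999"))  # same value, no scan
```

The sequential version touches every record before it reaches the last key, while the keyed version goes straight to it.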
Hadoop Random Access Databases

◦ Databases such as HBase, Cassandra, CouchDB, Dynamo, and MongoDB store huge
amounts of data and access that data in a random manner.
What is HBase?

◦ HBase is a distributed, column-oriented database built on top of the Hadoop file
system. It is an open-source project and is horizontally scalable.
◦ Its data model is similar to Google's Bigtable and is designed to provide
quick random access to huge amounts of structured data. It leverages the fault
tolerance provided by the Hadoop Distributed File System (HDFS).
◦ It is a part of the Hadoop ecosystem that provides random, real-time read/write
access to data in HDFS.
◦ Data can be stored in HDFS either directly or through HBase. Data consumers
read and access the data in HDFS randomly using HBase, which sits on top of
HDFS and provides read and write access.
Storage Mechanism in HBase

◦ HBase is a column-oriented database, and its tables are sorted by row key. The
table schema defines only column families, which hold key-value pairs. A table
can have multiple column families, and each column family can have any number of
columns. Column values within a family are stored contiguously on disk. Each cell
value of the table has a timestamp. In short, in HBase:
◦ A table is a collection of rows.
◦ A row is a collection of column families.
◦ A column family is a collection of columns.
◦ A column is a collection of key-value pairs.
Figure: example schema of a table in HBase.
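The nesting described above (table, row, column family, column, timestamped value) can be modelled with plain dictionaries. The sketch below is a toy in-memory model, not the HBase client API; the class name `ToyTable`, its methods, and the sample table and values are all hypothetical.

```python
import time


class ToyTable:
    """Toy model: row key -> column family -> qualifier -> {timestamp: value}."""

    def __init__(self, name, column_families):
        self.name = name
        # Only column families are fixed up front; columns are free-form.
        self.column_families = set(column_families)
        self.rows = {}

    def put(self, row_key, family, qualifier, value, timestamp=None):
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        ts = timestamp if timestamp is not None else time.time_ns()
        cell = (self.rows.setdefault(row_key, {})
                         .setdefault(family, {})
                         .setdefault(qualifier, {}))
        cell[ts] = value  # every cell value carries a timestamp

    def get(self, row_key, family, qualifier):
        """Return the newest version of a cell, or None if it is absent."""
        versions = self.rows.get(row_key, {}).get(family, {}).get(qualifier)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self):
        """Yield (row_key, row) pairs in sorted row-key order."""
        for row_key in sorted(self.rows):
            yield row_key, self.rows[row_key]


t = ToyTable("employee", ["personal", "professional"])
t.put("row2", "personal", "name", "Ravi", timestamp=1)
t.put("row1", "personal", "name", "Asha", timestamp=1)
t.put("row1", "personal", "name", "Asha K.", timestamp=2)
t.put("row1", "professional", "role", "engineer", timestamp=1)

print(t.get("row1", "personal", "name"))  # newest version wins
print([key for key, _ in t.scan()])       # rows come back sorted by key
```

Columns are created on first write, while writing to an undeclared column family is rejected, which mirrors the schema rule stated above.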
HBase and RDBMS

HBase | RDBMS
HBase is schema-less; it has no concept of a fixed column schema and defines only column families. | An RDBMS is governed by its schema, which describes the whole structure of its tables.
It is built for wide tables. HBase is horizontally scalable. | It is thin and built for small tables. It is hard to scale.
HBase has no transactions. | An RDBMS is transactional.
It has de-normalized data. | It has normalized data.
It is good for semi-structured as well as structured data. | It is good for structured data.
Features of HBase

◦ HBase is linearly scalable.
◦ It has automatic failover support.
◦ It provides consistent reads and writes.
◦ It integrates with Hadoop, both as a source and a destination.
◦ It has an easy-to-use Java API for clients.
◦ It provides data replication across clusters.
Where to Use HBase

◦ Apache HBase is used when random, real-time read/write access to Big Data is needed.
◦ It hosts very large tables on top of clusters of commodity hardware.
◦ Apache HBase is a non-relational database modeled after Google's Bigtable.
Just as Bigtable runs on top of the Google File System, Apache HBase runs on top of
Hadoop and HDFS.
Applications of HBase

◦ It is used whenever there is a need for write-heavy applications.
◦ HBase is used whenever fast random access to available data is needed.
◦ Companies such as Facebook, Twitter, Yahoo, and Adobe use HBase internally.
HBase - Architecture

◦ In HBase, tables are split into regions, which are served by the region servers. Regions
are vertically divided by column families into "stores". Stores are saved as files in
HDFS.
Figure: architecture of HBase.
◦ Note: The term "store" describes the storage structure within a region.
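Since tables are sorted by row key, each region can be thought of as a contiguous range of row keys. A minimal sketch of finding the region (and its server) for a given row key follows; the start keys, the server names, and the `locate` helper are hypothetical and do not reflect how the HBase client performs the lookup internally.

```python
import bisect

# Hypothetical regions of one table: (start_key, region_server).
# Each region covers [start_key, next region's start_key).
regions = [("", "rs1"), ("g", "rs2"), ("p", "rs3")]
start_keys = [start for start, _ in regions]


def locate(row_key):
    """Return the (start_key, server) pair of the region holding row_key."""
    # Find the last region whose start key is <= row_key.
    i = bisect.bisect_right(start_keys, row_key) - 1
    return regions[i]


print(locate("apple"))  # falls in the first region
print(locate("g"))      # a start key belongs to the region it opens
print(locate("zebra"))  # falls in the last region
```

Because the start keys are sorted, the lookup is a binary search and never has to scan the regions themselves.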
HBase - Architecture

◦ HBase has three major components: the client library, a master server, and region
servers. Region servers can be added or removed as per requirement.
MasterServer

◦ The master server:
◦ Assigns regions to the region servers, with the help of Apache ZooKeeper.
◦ Handles load balancing of the regions across region servers. It unloads busy
servers and shifts regions to less occupied servers.
◦ Maintains the state of the cluster by negotiating the load balancing.
◦ Is responsible for schema changes and other metadata operations, such as the creation
of tables and column families.
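The load-balancing idea above (unload busy servers, shift regions to less occupied ones) can be illustrated with a deliberately naive count-based balancer. This is an assumption-laden sketch and not HBase's actual balancing algorithm; the function name `balance` and the server and region names are hypothetical.

```python
def balance(assignments):
    """Move regions from the busiest to the idlest server until the region
    counts differ by at most one. Returns (new_assignments, moves)."""
    servers = {server: list(regs) for server, regs in assignments.items()}
    moves = []
    if not servers:
        return servers, moves
    while True:
        busiest = max(servers, key=lambda s: len(servers[s]))
        idlest = min(servers, key=lambda s: len(servers[s]))
        if len(servers[busiest]) - len(servers[idlest]) <= 1:
            return servers, moves
        region = servers[busiest].pop()
        servers[idlest].append(region)
        moves.append((region, busiest, idlest))


before = {"rs1": ["r1", "r2", "r3", "r4", "r5"], "rs2": ["r6"], "rs3": []}
after, moves = balance(before)
print({server: len(regs) for server, regs in after.items()})
print(moves)
```

Counting regions is the simplest possible load measure; a real balancer would also weigh factors such as request rate and data locality.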
Regions

◦ Regions are tables that are split up and spread across the region
servers.
◦ Region server
◦ The region servers host regions and:
◦ Communicate with the client and handle data-related operations.
◦ Handle read and write requests for all the regions under them.
◦ Decide the size of the region by following the region size thresholds.
◦ Looking deeper into a region server, it contains regions and stores.
The store contains a MemStore and HFiles. The MemStore works like a cache:
anything written to HBase is stored here initially.
Later, the data is transferred and saved in HFiles as blocks, and the MemStore
is flushed.
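The write path described above (writes land in the MemStore first and are later flushed to HFiles) can be sketched as a toy store. The class `ToyStore` and its cell-count threshold are hypothetical simplifications; this is not HBase code, and real flushes are driven by size thresholds rather than a cell count.

```python
class ToyStore:
    """Toy store: an in-memory MemStore plus a list of immutable 'HFiles'."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical: counted in cells
        self.memstore = {}
        self.hfiles = []  # each entry is a sorted tuple of (key, value); newest last

    def put(self, key, value):
        self.memstore[key] = value  # every write lands in the MemStore first
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Persist the MemStore contents as a sorted file, then empty it."""
        if self.memstore:
            self.hfiles.append(tuple(sorted(self.memstore.items())))
            self.memstore = {}

    def get(self, key):
        """Check the MemStore first, then the files from newest to oldest."""
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):
            for k, v in hfile:
                if k == key:
                    return v
        return None


store = ToyStore(flush_threshold=2)
store.put("b", 1)
store.put("a", 2)   # threshold reached: MemStore is flushed to one file
store.put("a", 3)   # newer value stays in the MemStore for now
print(store.hfiles)
print(store.get("a"), store.get("b"))
```

Reads consult the MemStore before the files, so the most recent write to a key is always the one returned.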
ZooKeeper

◦ ZooKeeper is an open-source project that provides services such as maintaining configuration
information, naming, and distributed synchronization.
◦ ZooKeeper has ephemeral nodes representing the different region servers. Master servers use
these nodes to discover available servers.
◦ In addition to availability, the nodes are also used to track server failures and network
partitions.
◦ Clients use ZooKeeper to locate region servers before communicating with them.
◦ In pseudo-distributed and standalone modes, HBase itself manages ZooKeeper.
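The role of ephemeral nodes in server discovery and failure tracking can be illustrated with an in-memory stand-in. This sketch does not use the ZooKeeper API; the class `ToyCoordinator`, its methods, the session ids, and the node paths are all hypothetical.

```python
class ToyCoordinator:
    """In-memory stand-in for ephemeral nodes: a node lives only as long as
    the session that created it."""

    def __init__(self):
        self._ephemeral = {}  # node path -> owning session id

    def register(self, session_id, path):
        """A region server announces itself by creating a node."""
        self._ephemeral[path] = session_id

    def expire_session(self, session_id):
        """Remove every node owned by a session, as when a server fails."""
        gone = [p for p, s in self._ephemeral.items() if s == session_id]
        for path in gone:
            del self._ephemeral[path]
        return gone

    def children(self, prefix):
        """List live nodes under a prefix; the master would poll or watch this."""
        return sorted(p for p in self._ephemeral if p.startswith(prefix + "/"))


zk = ToyCoordinator()
zk.register("session-1", "/servers/rs1")
zk.register("session-2", "/servers/rs2")
print(zk.children("/servers"))         # both region servers are visible

lost = zk.expire_session("session-1")  # rs1 crashes or is partitioned away
print(lost)
print(zk.children("/servers"))         # only rs2 remains
```

Because the node disappears together with its session, the master learns about a failed server simply by noticing that its node is gone.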
