Hdfs r20it III
Introduction to Hadoop:
History of Hadoop,
Hadoop Distributed File System,
Components of Hadoop
Analysing the Data with Hadoop,
Scaling Out,
Design of HDFS,
Java interfaces to HDFS Basics,
12/06/2024
HADOOP: SCALE UP OR SCALE OUT?
Scale up: faster servers, more memory, and more powerful processors.
Scale out: adding nodes for parallel computing. Hadoop scales out.
Hadoop: HDFS (Hadoop Distributed File System) and MapReduce
Apache projects
HDFS BUILDING BLOCKS
1. NAME NODE
2. SECONDARY NAME NODE
3. DATA NODE
4. BLOCK SIZE
5. RESOURCE MANAGER (JOB TRACKER)
6. NODE MANAGER (TASK TRACKER)
[Figure: Hadoop daemons. Master daemons: Name node and Job tracker. Slave daemons: Data node and Task tracker.]
BUILDING BLOCKS OF HADOOP
HDFS Architecture
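[Figure: HDFS architecture. Clients send metadata ops to the Namenode, which holds the metadata (name, replicas, ..., e.g. /home/foo/data, 6, ...). Clients perform block ops (read) against the Datanodes, and blocks are replicated across Datanodes.]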
Fault tolerance
FILESYSTEM IMAGE and EditLogs
Master/slave architecture
An HDFS cluster consists of a single NameNode, a master server
that manages the file system namespace and regulates access to
files by clients.
There are a number of DataNodes, usually one per node in the
cluster.
The DataNodes manage the storage attached to the nodes that they
run on.
HDFS exposes a file system namespace and allows user data to be
stored in files.
A file is split into one or more blocks, and these blocks are
stored on a set of DataNodes.
DataNodes serve read and write requests from clients, and perform
block creation, deletion, and replication upon instruction from
the NameNode.
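The division of labour described above can be sketched as a toy in-memory model. This is an illustration under stated assumptions, not how HDFS is implemented: the class and function names are hypothetical, the 4-byte block size is deliberately tiny, and the round-robin replica placement stands in for the real placement policy. The point it shows is that the NameNode holds only metadata (namespace and block locations), while the DataNodes hold the block data.

```python
class DataNode:
    """Stores block contents; knows nothing about file names."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> bytes

    def write_block(self, block_id, data):
        self.blocks[block_id] = data

    def read_block(self, block_id):
        return self.blocks[block_id]


class NameNode:
    """Holds only metadata: the namespace and the block locations."""

    def __init__(self, datanodes, replication=2):
        self.datanodes = datanodes
        self.replication = replication
        self.namespace = {}   # path -> ordered list of block ids
        self.locations = {}   # block id -> DataNodes holding a replica
        self._next_id = 0

    def allocate_block(self, path):
        """Record a new block for path and pick DataNodes for its replicas."""
        block_id = self._next_id
        self._next_id += 1
        n = len(self.datanodes)
        # Round-robin placement: an assumption, not the real HDFS policy.
        targets = [self.datanodes[(block_id + r) % n]
                   for r in range(self.replication)]
        self.namespace.setdefault(path, []).append(block_id)
        self.locations[block_id] = targets
        return block_id, targets

    def get_block_locations(self, path):
        return [(b, self.locations[b]) for b in self.namespace[path]]


def write_file(namenode, path, data, block_size=4):
    """Client side: split data into blocks and send each to its DataNodes."""
    for offset in range(0, len(data), block_size):
        block_id, targets = namenode.allocate_block(path)
        for node in targets:
            node.write_block(block_id, data[offset:offset + block_size])


def read_file(namenode, path):
    """Client side: ask the NameNode for locations, then read from DataNodes."""
    return b"".join(replicas[0].read_block(block_id)
                    for block_id, replicas in namenode.get_block_locations(path))


nodes = [DataNode(f"dn{i}") for i in range(3)]
nn = NameNode(nodes)
write_file(nn, "/home/foo/data", b"hello hdfs!")
print(read_file(nn, "/home/foo/data"))
```

In this sketch the file data never passes through the NameNode object: the client asks it where blocks should go, then talks to the DataNodes itself.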
NAME NODE
File system Namespace
Datanode