
Project Proposal - HDFS

Hadoop is a tool for big data computing that relies on its distributed file system HDFS. HDFS is fault tolerant, uses low-cost hardware, and provides high throughput access to large datasets. This project involves examining the open source Hadoop code to understand how HDFS operations like open, read, seek, write, and their security are implemented at both the client and server levels.

Uploaded by

ravelstein
Copyright
© All Rights Reserved

1 Introduction

Hadoop is an indispensable tool for Big Data computing. Like any other distributed
system, the success of its operation is augmented by its distributed file system
architecture, known as HDFS. HDFS is highly fault tolerant and is designed to be
deployed on low-cost hardware. HDFS provides high-throughput access to
application data and is suitable for applications that have large data sets.

2 Requirements
This project will involve a walkthrough of the open source Hadoop source code to
understand and illustrate the following HDFS operations. For each file system
operation listed below, the walkthrough will list the functions and libraries called
on both the client and the server when the operation occurs.
1) Open
2) Read
3) Seek
4) Write
5) Security of files for operations 1-4
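
To make the four data operations concrete before tracing them through Hadoop, here is a minimal local-file sketch in Java. It is not HDFS code and is not taken from the Hadoop source: it uses the standard java.nio FileChannel as a stand-in, and the class name FileOpsSketch and its two helper methods are hypothetical names chosen for illustration. In Hadoop itself, the corresponding client-side entry points are FileSystem.open, FSDataInputStream.read and seek, and FileSystem.create. The walkthrough would follow those calls, rather than the local calls shown here, into the server.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class: a local-file analogue, not part of Hadoop.
public class FileOpsSketch {

    // Write: create (or truncate) the file and write the whole payload.
    public static void write(Path file, String text) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }

    // Open, seek, read: open read-only, move to the byte offset,
    // then read up to `length` bytes (fewer if end of file comes first).
    public static String readAt(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) { // open
            ch.position(offset);                                                // seek
            ByteBuffer buf = ByteBuffer.allocate(length);
            while (buf.hasRemaining() && ch.read(buf) != -1) {                  // read
                // loop until the buffer is full or end of file is reached
            }
            buf.flip();
            return StandardCharsets.UTF_8.decode(buf).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fileops-sketch", ".txt");
        try {
            write(file, "hello hdfs walkthrough");
            System.out.println(readAt(file, 6, 4)); // prints "hdfs"
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

On a local file each of these steps is a single system call in one process. The point of the walkthrough is that in HDFS the same four steps are split between a client library and remote servers, which is why the proposal asks for the calls on both sides.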

References
1. HDFS source code: http://hadoop.apache.org/hdfs/version_control.html
2. HDFS Java API: http://hadoop.apache.org/core/docs/current/api/
