Big Data Intro
Sources of data
Raw data from the internet
Data from military organizations
Hospital data
NASA data
And so on…
Types of data
Unstructured data
Such as images, videos, and social media data
Semi-structured data
Such as XML files
Structured data
Such as databases and SQL servers
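To make the three types concrete, here is a small sketch that reads the same
value from each kind of source. The patient record, the table name, and the
wording of the note are all made up for illustration; only the standard
library is used.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Structured: fixed columns in a relational table (in-memory SQLite here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER, name TEXT, age INTEGER)")
conn.execute("INSERT INTO patients VALUES (1, 'Alice', 34)")
structured_age = conn.execute(
    "SELECT age FROM patients WHERE id = 1"
).fetchone()[0]
conn.close()

# Semi-structured: tags label the data, but no fixed schema is enforced.
xml_text = "<patient id='1'><name>Alice</name><age>34</age></patient>"
semi_structured_age = int(ET.fromstring(xml_text).findtext("age"))

# Unstructured: free text; the value has to be dug out with ad hoc parsing.
note = "Patient Alice, aged 34, was admitted on Monday."
unstructured_age = int(re.search(r"aged (\d+)", note).group(1))

print(structured_age, semi_structured_age, unstructured_age)
```

The less structure the source has, the more the reading code has to guess
about where the value is, which is why unstructured data is the hardest to
analyze at scale.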
What is the problem?
The problem is that data this large and unsorted cannot be
analyzed, or even classified. It becomes stored data that is
never put to any use
Fault tolerance
Hadoop keeps a backup of every block: more than one copy of
each block is stored across the nodes of the cluster
It provides write-once, read-many access to data
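The idea of keeping several copies of each block can be sketched with a toy
model. This is not how HDFS actually chooses nodes (its real placement policy
is rack-aware); the round-robin placement, the node names, and the block names
below are assumptions made for illustration. The replication factor of 3
matches the HDFS default.

```python
def place_blocks(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Toy placement only; assumes replication <= len(nodes).
    """
    placement = {}
    for i, block in enumerate(block_ids):
        placement[block] = {
            nodes[(i + k) % len(nodes)] for k in range(replication)
        }
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one copy on a live node."""
    failed = set(failed_nodes)
    return {block for block, holders in placement.items() if holders - failed}


nodes = ["node1", "node2", "node3", "node4"]
blocks = ["blk_0", "blk_1", "blk_2", "blk_3", "blk_4"]
placement = place_blocks(blocks, nodes, replication=3)

# With three copies per block, losing any two nodes loses no data.
survivors = readable_blocks(placement, failed_nodes=["node2", "node3"])
print(survivors == set(blocks))
```

With three copies per block, data is lost only if all three nodes holding the
same block fail at once.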
Pig, Hive, ZooKeeper
They are ready-made projects, each dedicated to a particular
type of job
For example, Pig is used for data processing projects, with
jobs written as Pig Latin scripts
Hadoop also lets you write your own MapReduce jobs in any
language (through Hadoop Streaming)
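As a rough illustration of what writing your own MapReduce job looks like,
here is a word count in plain Python. It is a local sketch, not a Hadoop job:
the function names are made up, and `run_local` only imitates the sort step
that the framework performs between map and reduce.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map step: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce step: sum the counts per word. Expects pairs sorted by word."""
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Imitate the framework: map, sort by key, then reduce."""
    return dict(reducer(sorted(mapper(lines))))


counts = run_local(["big data is big", "data is everywhere"])
print(counts)
```

On a real cluster the mapper and reducer would run as separate processes on
many nodes, each handling only its share of the data; the logic of each step
stays the same.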