
Big Data Analytics

Big data analytics describes the process of analyzing large amounts of raw data to uncover trends and patterns that help make informed decisions. There are several types of big data analytics, including predictive, descriptive, prescriptive, and diagnostic. Big data started gaining popularity in the early 1990s and refers to data that is too large for traditional data processing systems to handle. Common examples of big data sources include data generated from smartphones, social media, websites, and sensors in IoT devices. The 5 V's (volume, velocity, variety, veracity, and value) are used to decide whether data can be considered big data. Hadoop is an open source framework commonly used to store and process big data across clusters of computers in a distributed manner.

Uploaded by

Aravintha

Big Data Analytics

What is Big Data analytics?

It describes the process of uncovering trends, patterns, and
correlations in large amounts of raw data to help make data-informed
decisions.

What are the types of Big Data analytics?


• Predictive (forecasting)
• Descriptive (business intelligence and data mining)
• Prescriptive (optimization and simulation)
• Diagnostic analytics
When did big data start?
It started in the early 1990s.

It is not known exactly who first used the term.

Most people credit John R. Mashey, an American computer scientist
(who at the time worked at Silicon Graphics), with making the term
popular.
We all use smartphones, but have you ever wondered how much data one
generates in the form of
• Text
• Phone calls
• Emails
• Photos
• Videos
• Searches and
• Music
Approximately 40 exabytes of data are generated every month by a
single smartphone user.
Now imagine this number multiplied by 5 billion smartphone users.
That's a lot for our minds to even process, isn't it?
40 exabytes * 5,000,000,000 = ?
In fact, this amount of data is too much for traditional computing
systems to handle, and this massive amount of data is what we term
big data.
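The multiplication left open on the slide can be worked out directly. The sketch below takes the two figures quoted above at face value (they are not independently verified here) and assumes decimal units, where 1 zettabyte = 1,000 exabytes. The function and constant names are illustrative only.

```python
# Figures as quoted on the slide; they are not independently verified here.
EXABYTES_PER_USER_PER_MONTH = 40
SMARTPHONE_USERS = 5_000_000_000

# Assumption: decimal (SI) units, so 1 ZB = 1,000 EB.
EXABYTES_PER_ZETTABYTE = 1_000


def monthly_total_exabytes(per_user_eb: int, users: int) -> int:
    """Multiply the per-user monthly volume by the number of users."""
    return per_user_eb * users


total_eb = monthly_total_exabytes(EXABYTES_PER_USER_PER_MONTH, SMARTPHONE_USERS)
total_zb = total_eb // EXABYTES_PER_ZETTABYTE

print(f"{total_eb:,} EB per month")  # 200,000,000,000 EB per month
print(f"{total_zb:,} ZB per month")  # 200,000,000 ZB per month
```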

Let's have a look at the data generated per minute on the internet:

• Snapchat: 2.1 million snaps are shared
• Google: 3.8 million search queries are made
• Facebook: 1.0 million people log on
• YouTube: 4.5 million videos are watched
• Email: 188 million emails are sent
So, how do you classify any data as big data?
This is possible with the concept of the 5 V's:
• Volume
• Velocity
• Variety
• Veracity and
• Value
Let us understand this with an example from the health care industry.

Hospitals and clinics across the world generate massive volumes of data:
2,314 exabytes of data are collected annually in the form of patient
records and test results.
All this data is generated at a very high speed, which is the velocity
of big data.
Variety refers to the various data types: structured, semi-structured,
and unstructured.
Examples include Excel records, log files, and X-ray images.
• Excel records -> structured
• Log files -> semi-structured
• X-ray images -> unstructured
The accuracy and trustworthiness of the generated data is termed
veracity.

Analyzing all this data benefits the medical sector by enabling
faster disease detection, better treatment, and reduced cost. This is
known as the value of big data.
The 5 V's in health care:
• Volume: 2,314 exabytes of data from hospitals and clinics
• Velocity: patient records and test results data
• Variety: structured, semi-structured, unstructured
• Veracity: accuracy, trustworthiness
• Value: disease detection, better treatment, reduced cost
How do we store and process this big data?
To do this job we have various frameworks, such as

• Cassandra
• Hadoop and
• Spark
Let us take Hadoop as an example and see how it stores and
processes big data.
Hadoop uses a distributed file system, known as the Hadoop
Distributed File System (HDFS), to store big data. If you have a huge
file, it will be broken down into smaller chunks that are stored on
various machines.

[Figure: a 300 MB file is split into chunks of 128 MB, 128 MB, and 44 MB]
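The chunking shown on the slide can be sketched in a few lines. This is only an illustration of the arithmetic, assuming the 128 MB chunk size from the slide's example. `split_into_blocks` is a hypothetical helper, not part of any Hadoop API.

```python
from typing import List


def split_into_blocks(file_size_mb: int, block_size_mb: int = 128) -> List[int]:
    """Return the sizes of the chunks a file of the given size is cut into.

    Every chunk is full-sized except possibly the last one, which holds
    whatever is left over.
    """
    if file_size_mb < 0 or block_size_mb <= 0:
        raise ValueError("sizes must be non-negative and block size positive")
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    blocks = [block_size_mb] * full_blocks
    if remainder:
        blocks.append(remainder)
    return blocks


print(split_into_blocks(300))  # [128, 128, 44]
```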
Not only that: when the file is broken up, copies of each chunk are
also made and placed on different nodes. This stores your big data in a
distributed way and makes sure that even if one machine fails, your data
is safe on another.

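[Figure: each chunk is copied to two nodes: the first 128 MB chunk to
nodes A and B, the second 128 MB chunk to nodes A and C, and the 44 MB
chunk to nodes C and B]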
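The idea of keeping copies on different nodes can be sketched as follows. This is a toy model: it uses two copies per chunk as in the slide's diagram and a simple round-robin placement, which is an assumption for illustration and not HDFS's actual placement policy. The names `place_replicas` and `readable_blocks` are hypothetical.

```python
from typing import Dict, List


def place_replicas(blocks: List[str], nodes: List[str],
                   replication: int = 2) -> Dict[str, List[str]]:
    """Assign each block to `replication` distinct nodes, round-robin."""
    if not 0 < replication <= len(nodes):
        raise ValueError("replication must be between 1 and the node count")
    placement = {}
    for i, block in enumerate(blocks):
        # Consecutive positions in the node list are always distinct nodes.
        placement[block] = [nodes[(i + r) % len(nodes)]
                            for r in range(replication)]
    return placement


def readable_blocks(placement: Dict[str, List[str]],
                    failed_node: str) -> List[str]:
    """Blocks that still have at least one copy on a surviving node."""
    return [block for block, holders in placement.items()
            if any(node != failed_node for node in holders)]


blocks = ["block-1 (128 MB)", "block-2 (128 MB)", "block-3 (44 MB)"]
placement = place_replicas(blocks, ["A", "B", "C"])
print(placement)
print(readable_blocks(placement, failed_node="A"))
```

With two copies per chunk, losing any single node still leaves every chunk readable, which is the property the slide describes.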
The MapReduce technique is used to process big data. A lengthy task A
is broken into smaller tasks B, C, and D.
Now, instead of one machine, three machines each take up one task,
complete it in parallel, and the results are assembled at the end.

[Figure: Task A is split into Task B, Task C, and Task D, whose outputs
are combined into a single Result]
Because of this, processing becomes easy and fast. This is known as
parallel processing.
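The split, process, and assemble pattern described above can be sketched with a word count. This is not Hadoop code: worker threads stand in for the three machines, and the function names are made up for the illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import List


def map_count_words(chunk: str) -> Counter:
    """Map step: count the words in one chunk, independently of the others."""
    return Counter(chunk.lower().split())


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(chunks: List[str]) -> Counter:
    """Process each chunk in its own worker, then assemble the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        partial_counts = list(pool.map(map_count_words, chunks))
    return reduce(reduce_counts, partial_counts, Counter())


# Task A (count all words) is split into three smaller tasks, one per chunk.
chunks = ["big data is big", "data is valuable", "big value"]
result = word_count(chunks)
print(result["big"], result["data"], result["is"])  # 3 2 2
```

The map step never needs to see another worker's chunk, which is what allows the smaller tasks to run side by side.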
Now that we have stored and processed our big data, we can
analyze it for numerous applications.


In games like Halo 3 and Call of Duty, designers analyze user data to
understand at which stage most users pause, restart, or quit playing.
This insight can help them rework the storyline of the game and
improve the user experience, which in turn reduces the customer churn rate.

Similarly, big data also helped with disaster management during Hurricane
Sandy in 2012.

It was used to gain a better understanding of the storm's effect on the East
Coast of the U.S., and the necessary measures were taken. It could predict the
hurricane's landfall five days in advance, which wasn't possible earlier.

These are some clear indications of how valuable big data can be
once it is accurately processed and analyzed.
What is the Hadoop big data tool in IoT?
Apache Hadoop is an open source framework that is used to efficiently
store and process large datasets ranging in size from gigabytes to
petabytes of data. Instead of using one large computer to store and
process the data, Hadoop allows clustering multiple computers to
analyze massive datasets in parallel more quickly.

What is Hadoop's function in IoT?


Hadoop can tie together many single-CPU commodity computers into a
single functional distributed system. In practice, the clustered
machines can read the dataset in parallel and provide much higher
throughput, and such a cluster is cheaper than one high-end server.
Hadoop runs code across a cluster of computers.
How is data analytics used in IoT?
By analyzing the data generated by sensors embedded in machines,
organizations can identify patterns that indicate potential equipment
failure. This enables organizations to schedule maintenance before a
failure occurs, reducing downtime and increasing efficiency.
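One simple way to spot such a pattern is to compare recent sensor readings with a baseline taken while the machine was known to be healthy. The sketch below is a minimal illustration under assumptions of its own: the readings are invented vibration values, the three-standard-deviation limit is an arbitrary choice, and `first_warning_index` is a hypothetical helper rather than a method prescribed by the slides.

```python
from statistics import mean, stdev
from typing import List, Optional


def first_warning_index(readings: List[float], baseline_size: int = 5,
                        window: int = 3, sigmas: float = 3.0) -> Optional[int]:
    """Return the index of the first reading whose trailing moving average
    exceeds the baseline mean plus `sigmas` baseline standard deviations.

    The first `baseline_size` readings are assumed to come from a healthy
    machine. Returns None if no reading crosses the limit.
    """
    if baseline_size < 2 or len(readings) < baseline_size + window:
        return None
    baseline = readings[:baseline_size]
    limit = mean(baseline) + sigmas * stdev(baseline)
    # Only windows made up entirely of post-baseline readings are checked.
    for end in range(baseline_size + window, len(readings) + 1):
        if mean(readings[end - window:end]) > limit:
            return end - 1
    return None


# Invented vibration readings: steady at first, then drifting upward.
vibration = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.1, 1.6, 2.1, 2.7]
print(first_warning_index(vibration))  # 7
```

A warning at index 7 arrives while the readings are still climbing, which is the point at which maintenance could be scheduled.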

What are the real-time applications of IoT?


One example is a city that uses a combination of sensors in different
capacities for various tasks such as managing traffic, handling waste
management, optimizing streetlights, saving water, monitoring energy
expenditure, creating smart buildings, and more.
Examples of IoT in Real Life
1. Home Automation
2. Wearable Health Monitors
3. Disaster Management
4. Biometric Security Systems
5. Smart Cars
6. Process Automation
7. Farming
8. Shopping Malls
Home Automation
Home automation is one of the best examples of IoT. Smart homes or
IoT-based home automation systems are becoming popular day by day.
In a smart home, consumer electronic gadgets such as lights, fans, air-
conditioners, etc. can be connected to each other via the internet. This
interconnection enables the user to operate these devices from a
distance. A smart home is capable of lighting control, energy
management, expansion, and remote access. Currently, this application
of IoT is not utilized at a large scale because the installation cost is too
high, which makes it difficult for a majority of people to afford it.
However, home automation holds quite a promising future.
Wearable Health Monitors
Wearable health monitors are both captivating and useful. They include
smart clothes, smart wristwear, and medical wearables that provide us
with high-quality health services. They are designed to track metrics
such as pulse rate, step count, heart rate, etc. This data is recorded and
can be sent to doctors for detailed fitness analysis. These IoT-based
smart wearable devices are influencing our lifestyles a lot. Apart from
performing these basic operations, they can also raise an alarm and
send an alert in case of a medical emergency such as an asthma attack,
seizures, etc.
Disaster Management
IoT helps in the prediction and management of natural disasters. Take
the example of forest fires. To avoid the chaos and destruction caused
by a forest fire, various sensors can be installed around the boundaries
of the forests. These sensors continuously monitor the temperature
and carbon content in the region. A detailed report is regularly sent to a
common monitoring hub. In case of a forest fire, an alert is sent to the
control room, police station, and fire brigade. Therefore, IoT helps in
staying prepared and responding swiftly in case of emergency.
Biometric Security Systems
A lot of security agencies make use of biometric systems to mark daily
attendance, allow access to authorized personnel only, and provide other
related services. Advanced security, data communication, and
minimized human intervention are some of the features of IoT being
utilized in this sector. Biometric technology makes use of fingerprint,
voice, eye, and face recognition. The reliability of IoT based security
systems is higher than the manual or automated approach. The devices
used in biometric security systems are interlinked to each other and
possess the ability to dump the data after every usage to the host
computer. This scanned data is stored for future use, and the useful
information is retrieved as per requirement.
Smart Cars
IoT can be used to connect cars with each other in order to exchange
information like location, speed, and dynamics. An estimate shows that
by 2022, there will be 24 billion connected cars in the world. We use
IoT in our daily life without even realizing its presence. For example,
while finding the shortest route, while driving semi-automatic smart
cars, etc. IoT is also used in vehicle repair and maintenance. It does not
only remind the customer about the regular servicing date but also
assists the consumer in repair and maintenance by providing proper
guidance. On the basis of the features provided, the communication
technique of connected vehicle technology is classified into two
broad categories:
• Vehicle-to-infrastructure (V2I)
It allows the smart car to run a diagnostic check and provide a
detailed analysis report to the user. It is also used to find the
shortest route and to locate an empty parking spot.
• Vehicle-to-vehicle (V2V)
V2V communication of smart cars makes use of high-speed data
transfer and high bandwidth. It lets the car perform demanding tasks such
as avoiding collisions, bypassing unnecessary traffic, etc.
Process Automation
In the manufacturing industry, performing recurring tasks, such as label
wrapping, packaging, etc., manually is difficult and prone to human error;
therefore, automation comes into play. For instance, take the example of a
cold drink manufacturing plant. Here, manufacturing machines and
conveyor belts are required to be interconnected in order to share
information, status, and data. This interconnection is IoT dependent. The
status of the manufactured product and the machine health report are sent to
the manufacturer at regular intervals in order to identify faults in
advance. An IoT-equipped plant is advantageous as it raises the
production speed and maintains uniform product quality
throughout production. It also helps to make the workplace more
efficient and safe by reducing human error.
Farming
Due to climate change and the water crisis, farmers face a lot of
trouble such as crop flattening, soil erosion, drought, etc. These
problems can be reduced by using an IoT-based farming system.
For example, the IoT based irrigation system makes use of a number of
sensors to monitor the moisture content of the soil. If the moisture level
drops below a certain range, it automatically turns on the irrigation
pump. Other than this, IoT also helps farmers to examine soil health.
Before planning to farm a new batch of crops, a farmer needs to recover
the soil nutrients. The IoT enriched software allows the user or the
farmer to select the best nutrient restoring crops. It also helps in sensing
the requirement of fertilizer and numerous other farming needs.
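The irrigation rule described above (turn the pump on when soil moisture drops below a set level) can be sketched as follows. The 30% threshold and the sample readings are hypothetical values chosen for the illustration, and the function names are not taken from any real device API.

```python
from typing import List

# Assumption: moisture is reported as a percentage, and 30% is the level
# below which the crop needs water. Both are illustrative choices.
MOISTURE_THRESHOLD_PERCENT = 30.0


def pump_should_run(moisture_percent: float,
                    threshold_percent: float = MOISTURE_THRESHOLD_PERCENT) -> bool:
    """Turn the pump on only when moisture is below the threshold."""
    return moisture_percent < threshold_percent


def pump_schedule(samples: List[float],
                  threshold_percent: float = MOISTURE_THRESHOLD_PERCENT) -> List[bool]:
    """Apply the rule to a series of sensor readings, one decision per reading."""
    return [pump_should_run(sample, threshold_percent) for sample in samples]


readings = [42.0, 35.5, 29.9, 24.0, 31.2]
print(pump_schedule(readings))  # [False, False, True, True, False]
```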
Shopping Malls
IoT finds its major application in shopping malls. In most of the malls, a
barcode scanner is used to scan the barcode present on every product.
After scanning, it extracts the necessary information and sends the data
to the host computer. The computer is further connected to a billing
machine that hands over the bill to the customer after proper
processing. All these devices are connected together with the help of
the Internet of Things.
Which of the following statements is not correct
about the Hadoop Distributed File System (HDFS)?

A) HDFS is the storage layer of Hadoop

B) Data gets stored in a distributed manner in HDFS

C) HDFS performs parallel processing of data

D) Smaller chunks of data are stored on multiple data nodes in HDFS

Answer: (C)
HDFS does not perform parallel processing, as that is the work of
MapReduce.
HDFS is the storage layer of Hadoop that stores smaller
chunks of data on multiple data nodes in a distributed manner.
Thank you
