What is Big Data Analytics
1. Risk Management
Use Case: Banco de Oro, a Philippine banking company, uses Big Data analytics to identify
fraudulent activities and discrepancies. The organization leverages it to narrow down a list of
suspects or root causes of problems.
• Stage 1 - Business case evaluation - The Big Data analytics lifecycle begins with a business case,
which defines the reason and goal behind the analysis.
• Stage 2 - Identification of data - Here, a broad variety of data sources are identified.
• Stage 3 - Data filtering - All of the identified data from the previous stage is filtered here to
remove corrupt data.
• Stage 4 - Data extraction - Data that is not compatible with the tool is extracted and then
transformed into a compatible form.
• Stage 5 - Data aggregation - In this stage, data with the same fields across different datasets are
integrated.
• Stage 6 - Data analysis - Data is evaluated using analytical and statistical tools to discover useful
information.
• Stage 7 - Visualization of data - With tools like Tableau, Power BI, and QlikView, Big Data analysts
can produce graphic visualizations of the analysis.
• Stage 8 - Final analysis result - This is the last step of the Big Data analytics lifecycle, where the
final results of the analysis are made available to business stakeholders who will take action.
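To make the middle stages concrete, here is a minimal sketch in Python using pandas; the dataset, the column names (account_id, amount), and the anomaly threshold are all hypothetical, chosen only to illustrate stages 3, 5, and 6.

import pandas as pd

# Hypothetical transaction data standing in for an identified source (stage 2).
raw = pd.DataFrame({
    "account_id": [1, 1, 2, 2, None],
    "amount": [120.0, 95.5, 40.0, -1.0, 60.0],
})

# Stage 3 - data filtering: remove corrupt records (missing IDs, negative amounts).
clean = raw.dropna(subset=["account_id"])
clean = clean[clean["amount"] >= 0]

# Stage 5 - data aggregation: integrate records that share the same field.
per_account = clean.groupby("account_id")["amount"].agg(["count", "sum", "mean"])

# Stage 6 - data analysis: flag accounts whose average spend looks anomalous.
threshold = 2 * per_account["mean"].mean()
suspicious = per_account[per_account["mean"] > threshold]
print(suspicious)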
1. Descriptive Analytics
This summarizes past data into a form that people can easily read. This helps in creating reports, like a
company’s revenue, profit, sales, and so on. Also, it helps in the tabulation of social media metrics.
Use Case: The Dow Chemical Company analyzed its past data to increase facility utilization across its
office and lab space. Using descriptive analytics, Dow was able to identify underutilized space. This space
consolidation helped the company save nearly US $4 million annually.
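As an illustration of descriptive analytics, the sketch below summarizes hypothetical facility-usage records with pandas; the column names and figures are invented for the example, not Dow's actual data.

import pandas as pd

# Hypothetical facility-usage log.
usage = pd.DataFrame({
    "building": ["A", "A", "B", "B", "C"],
    "occupancy_pct": [85, 90, 30, 25, 60],
})

# Descriptive analytics: summarize past data into an easily read report.
report = usage.groupby("building")["occupancy_pct"].mean()
print(report)                               # average utilization per building
print(report[report < 50].index.tolist())   # underutilized candidates for consolidation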
2. Diagnostic Analytics
This is done to understand what caused a problem in the first place. Techniques like drill-down, data
mining, and data recovery are all examples. Organizations use diagnostic analytics because it provides
in-depth insight into a particular problem.
Use Case: An e-commerce company’s report shows that their sales have gone down, although
customers are adding products to their carts. This can be due to various reasons like the form didn’t load
correctly, the shipping fee is too high, or there are not enough payment options available. This is where
you can use diagnostic analytics to find the reason.
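A drill-down of this kind can be sketched in a few lines of pandas; the event log and reason codes below are hypothetical.

import pandas as pd

# Hypothetical abandoned-cart events with candidate causes already recorded.
carts = pd.DataFrame({
    "cart_id": range(6),
    "abandon_reason": ["shipping_fee", "form_error", "shipping_fee",
                       "payment_options", "shipping_fee", "form_error"],
})

# Drill-down: count how often each candidate cause appears.
cause_counts = carts["abandon_reason"].value_counts()
print(cause_counts)   # the most frequent reason points to the likely root cause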
3. Predictive Analytics
This type of analytics looks at historical and present data to make predictions about the future.
Predictive analytics uses data mining, AI, and machine learning to forecast customer trends, market
trends, and so on.
Use Case: PayPal determines what kind of precautions they have to take to protect their clients against
fraudulent transactions. Using predictive analytics, the company uses all the historical payment data and
user behavior data and builds an algorithm that predicts fraudulent activities.
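The following is a minimal sketch of this idea using scikit-learn, with synthetic features standing in for payment and user-behavior data; PayPal's actual features, models, and scale are of course far beyond this.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for historical payment and user-behavior features.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))             # e.g. amount, velocity, geo-distance
y = (X[:, 0] + X[:, 1] > 2).astype(int)    # toy fraud label

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# Predict the probability that unseen transactions are fraudulent.
print(model.predict_proba(X_test[:5])[:, 1])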
4. Prescriptive Analytics
This type of analytics prescribes the solution to a particular problem. Prescriptive analytics works with
both descriptive and predictive analytics. Most of the time, it relies on AI and machine learning.
Use Case: Prescriptive analytics can be used to maximize an airline’s profit. This type of analytics is used
to build an algorithm that will automatically adjust the flight fares based on numerous factors, including
customer demand, weather, destination, holiday seasons, and oil prices.
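A real airline would use ML-driven optimization, but a simple rule-based sketch conveys the idea; all coefficients and inputs below are made up for illustration.

# Hypothetical rule-based fare adjustment.
def adjust_fare(base_fare: float, demand_ratio: float,
                days_to_holiday: int, oil_price_index: float) -> float:
    fare = base_fare * (1 + 0.5 * max(demand_ratio - 1, 0))   # surge with demand
    if days_to_holiday <= 14:
        fare *= 1.15                                          # holiday-season premium
    fare *= 1 + 0.1 * (oil_price_index - 1)                   # pass through fuel costs
    return round(fare, 2)

print(adjust_fare(base_fare=200.0, demand_ratio=1.3,
                  days_to_holiday=10, oil_price_index=1.2))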
A widely used tool for this kind of work is Apache Spark, which supports real-time processing and analysis of large amounts of data.
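A minimal PySpark sketch of such an analysis might look as follows, assuming PySpark is installed and a sales.csv file with region and revenue columns exists; both the file and its columns are hypothetical.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("SalesAnalysis").getOrCreate()

# Read the (hypothetical) dataset; Spark distributes the work across the cluster.
df = spark.read.csv("sales.csv", header=True, inferSchema=True)

# Aggregate revenue per region in parallel.
df.groupBy("region").sum("revenue").show()

spark.stop()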
Here are some of the sectors where Big Data is actively used:
• E-commerce - Predicting customer trends and optimizing prices are a few of the ways e-commerce
companies use Big Data analytics
• Marketing - Big Data analytics helps to drive high ROI marketing campaigns, which result in
improved sales
• Education - Used to develop new and improve existing courses based on market requirements
• Healthcare - With the help of a patient’s medical history, Big Data analytics is used to predict
how likely they are to have health issues
• Media and entertainment - Used to understand the demand for shows, movies, songs, and more,
and to deliver personalized recommendations to users
• Banking - Customer income and spending patterns help to predict the likelihood of choosing
various banking offers, like loans and credit cards
• Government - Big Data analytics helps governments in law enforcement, among other things
Big data is a collection of data from many sources and is often described by five characteristics, known as
the 5 V's: volume, velocity, variety, veracity, and value. These characteristics help to understand the
complexity of big data and can help data scientists derive more value from their data.
Volume: Volume refers to the 'size' or amount of data. For instance, YouTube has over 2.6 billion monthly
active users and generates a large amount of data daily, which can't be processed manually; thus,
modern techniques and tools are used to handle such voluminous data.
Velocity: Velocity refers to the 'speed' or rate at which data is accumulated. For instance, YouTube grew
from 200 million monthly active users in 2010 to 2.6 billion in 2022, and the rate at which those users
generate data has grown accordingly.
Variety: Variety refers to the 'heterogeneity' or diversity of data. The data can be structured,
unstructured, or semi-structured.
Veracity: Veracity refers to the 'trustworthiness' or quality of data, i.e., whether the data is free
from ambiguities.
Value: Value refers to the 'Insights' gained from the data. It means whether the given data set is
producing any useful result. Data, in its raw form, gives no valuable result, but once processed efficiently,
it can give us important insights that could help us in decision-making.
• Visualization
The use of tools like charts, graphs, and maps to create visual representations of data (a minimal sketch follows this list)
• Data provenance
Checking the origin of a piece of data, and the processes and techniques used to produce it
• Transparency
The right of a person to know whether a company collects, uses, or processes their personal data
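The visualization sketch mentioned above uses matplotlib; the monthly revenue figures are invented for the example.

import matplotlib.pyplot as plt

# Hypothetical monthly revenue figures to visualize.
months = ["Jan", "Feb", "Mar", "Apr"]
revenue = [120, 135, 128, 150]

plt.bar(months, revenue)
plt.title("Monthly Revenue")
plt.ylabel("Revenue (USD, thousands)")
plt.show()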
DFS (Distributed File System) is a technology that allows you to group shared folders located on different
servers into one or more logically structured namespaces. The main purpose of the Distributed File
System (DFS) is to allow users of physically distributed systems to share their data and resources by
using a common file system. A collection of workstations and mainframes connected by a Local Area
Network (LAN) is a typical configuration of a Distributed File System. A DFS is executed as a part of the
operating system. In DFS, a namespace is created, and this process is transparent for the clients.
Components of DFS
DFS has two main components: the namespace component and the file replication component. In the
case of failure and heavy load, these components together improve data availability by allowing data
shared in different locations to be logically grouped under one folder, known as the "DFS root". It is not
necessary to use both components together: the namespace component can be used without the file
replication component, and the file replication component can be used between servers without the
namespace component.
Early iterations of DFS made use of Microsoft's File Replication Service (FRS), which allowed for
straightforward file replication between servers: FRS recognises new or updated files and distributes the
most recent version of the whole file to all servers. Windows Server 2003 R2 introduced "DFS
Replication" (DFSR), which improves on FRS by copying only the portions of files that have changed and
by minimising network traffic with data compression. Additionally, it provides users with flexible
configuration options to manage network traffic on a configurable schedule.
Features of DFS
• Transparency
o Structure transparency: There is no need for the client to know about the number or
locations of file servers and the storage devices. Multiple file servers should be provided
for performance, adaptability, and dependability.
o Access transparency: Both local and remote files should be accessible in the same
manner. The file system should automatically locate the accessed file and send it to the
client's side.
o Naming transparency: There should not be any hint in the name of the file to the
location of the file. Once a name is given to the file, it should not change when the file is
transferred from one node to another.
o Replication transparency: If a file is copied on multiple nodes, the copies of the file and
their locations should be hidden from the clients.
• User mobility: It will automatically bring the user’s home directory to the node where the user
logs in.
• Performance: Performance is based on the average amount of time needed to service client
requests. This time covers the CPU time + time taken to access secondary storage +
network access time. It is advisable that the performance of a Distributed File System be
comparable to that of a centralized file system.
• Simplicity and ease of use: The user interface of a file system should be simple and the number
of commands in the file should be small.
• High availability: A Distributed File System should be able to continue functioning in the event of
partial failures like a link failure, a node failure, or a storage drive crash.
A highly reliable and adaptable distributed file system should have multiple independent
file servers controlling multiple independent storage devices.
• Scalability: Since growing the network by adding new machines or joining two networks
together is routine, the distributed system will inevitably grow over time. As a result, a good
distributed file system should be built to scale quickly as the number of nodes and users in the
system grows. Service should not be substantially disrupted as the number of nodes and users
grows.
• Data integrity: Multiple users frequently share a file system. The integrity of data saved in a
shared file must be guaranteed by the file system. That is, concurrent access requests from many
users who are competing for access to the same file must be correctly synchronized using a
concurrency control method. Atomic transactions are a high-level concurrency control
mechanism for data integrity that is frequently offered to users by a file system (a minimal
locking sketch follows this list).
• Security: A distributed file system should be secure so that its users may trust that their data will
be kept private. To safeguard the information contained in the file system from unwanted &
unauthorized access, security mechanisms must be implemented.
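The locking sketch referenced under "Data integrity" is below. It shows single-machine concurrency control with a POSIX file lock; real distributed file systems use distributed lock managers or transactions, so this is only an analogy.

import fcntl

def append_record(path: str, record: str) -> None:
    # Exclusive lock so competing writers are correctly synchronized.
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        f.write(record + "\n")
        fcntl.flock(f, fcntl.LOCK_UN)   # release so other writers can proceed

append_record("shared.log", "user42: balance updated")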
Applications of DFS
• NFS: NFS stands for Network File System. It is a client-server architecture that allows a computer
user to view, store, and update files remotely. The protocol of NFS is one of the several
distributed file system standards for Network-Attached Storage (NAS).
• CIFS: CIFS stands for Common Internet File System. CIFS is a dialect of SMB; that is, CIFS is an
implementation of the SMB protocol designed by Microsoft.
• SMB: SMB stands for Server Message Block. It is a file-sharing protocol invented by IBM. The
SMB protocol was created to allow computers to perform read and write operations on files on a
remote host over a Local Area Network (LAN). The directories on the remote host that can be
accessed via SMB are called "shares".
• Hadoop: Hadoop is a collection of open-source software utilities. It provides a software framework
for distributed storage and processing of big data using the MapReduce programming model. The
core of Hadoop contains a storage part, known as the Hadoop Distributed File System (HDFS), and a
processing part, the MapReduce programming model (a word-count sketch in this style follows
this list).
• NetWare: NetWare is a discontinued computer network operating system developed by Novell, Inc.
It primarily used cooperative multitasking to run different services on a personal computer, using
the IPX network protocol.
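The word-count sketch referenced under Hadoop is below: a local simulation of the MapReduce programming model in Python, in the style of Hadoop Streaming (where real mappers and reducers read stdin and write stdout, and Hadoop handles distribution and shuffling).

from itertools import groupby

def mapper(lines):
    # Map: emit (word, 1) for every word in the input.
    for line in lines:
        for word in line.split():
            yield word, 1

def reducer(pairs):
    # Reduce: sum the counts for each word after the shuffle/sort phase.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

# Local stand-in for data that HDFS would normally store in blocks.
text = ["big data needs big tools", "hadoop stores big data"]
for word, total in reducer(mapper(text)):
    print(word, total)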
Working of DFS
• Standalone DFS namespace: It allows only for DFS roots that exist on the local computer
and do not use Active Directory. A standalone DFS namespace can only be accessed on the
computer on which it is created. It does not provide any fault tolerance and cannot be linked to
any other DFS. Standalone DFS roots are rarely encountered because of their limited advantages.
• Domain-based DFS namespace: It stores the configuration of DFS in Active Directory, making
the DFS namespace root accessible at \\<domainname>\<dfsroot> or \\<FQDN>\<dfsroot>.
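From a client's point of view, the namespace is transparent: files are listed and opened with ordinary paths, regardless of which physical server stores them. A minimal sketch, assuming a hypothetical domain-based root \\example.local\shared:

import os

# Hypothetical DFS namespace path (Windows UNC syntax); substitute your own
# domain name and DFS root.
dfs_root = r"\\example.local\shared"

# Clients browse the namespace like any folder, unaware of the physical servers.
for entry in os.listdir(dfs_root):
    print(entry)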
Advantages of Distributed File System (DFS)
• Improves the ability to scale the amount of data stored and to exchange data.
• Provides transparency of data even if a server or disk fails.
Disadvantages of Distributed File System (DFS)
• Nodes and connections need to be secured, so security is a concern.
• Messages and data can be lost in the network while moving from one node to another.
• Handling a database is harder in a Distributed File System than in a single-user system.
• Overloading can occur if all nodes try to send data at once.