UNIT 2 Notes by ARUN JHAPATE
According to Gartner:
"Big data is high-volume, high-velocity and high-variety information assets that demand
cost-effective, innovative forms of information processing for enhanced insight and
decision making."
From Wikipedia:
Big Data is a broad term for data sets so large or complex that they are difficult to process
using traditional data processing applications. Challenges include analysis, capture,
curation, search, sharing, storage, transfer, visualization, and information privacy.
Volume: The amount of data is immense. Each day 2.3 trillion gigabytes of new
data is being created.
Velocity: The speed of data (always in flux) and of its processing (analysis of streaming
data to produce near-real-time or real-time results).
Variety: The different types of data, structured as well as unstructured.
Visibility: This dimension refers to a customer's ability to see and track their
experience or order through the operations process. High-visibility operations include
courier companies, where you can track your package online, and retail stores where you
pick up the goods and purchase them over the counter.
Value: Value is the end game. After addressing volume, velocity, variety,
variability, veracity, and visualization – which takes a lot of time, effort and
resources – you want to be sure your organization is getting value from the data.
Variability: Variability is different from variety. A coffee shop may offer 6
different blends of coffee, but if you get the same blend every day and it tastes
different every day, that is variability. The same is true of data; if the meaning is
constantly changing it can have a huge impact on your data homogenization.
It is the combination of these factors, high-volume, high-velocity and high-variety that
serves as the basis for data to be termed Big Data. Big Data platforms and solutions
provide the tools, methods and technologies used to capture, curate, store, search and
analyze the data to find new correlations, relationships and trends that were previously
unavailable.
Big data analytics examines large amounts of data to uncover hidden patterns,
correlations and other insights. With today's technology, it's possible to analyze your
data and get answers from it immediately. Big Data Analytics helps you to understand
your organization better. With the use of Big data analytics, one can make informed
decisions without blindly relying on guesses.
The primary goal of Big Data applications is to help companies make more informed
business decisions by analyzing large volumes of data. It could include web server logs,
Internet click stream data, social media content and activity reports, text from customer
emails, mobile phone call details and machine data captured by multiple sensors.
Organisations from different domains are investing in Big Data applications to examine
large data sets and uncover hidden patterns, unknown correlations, market trends,
customer preferences and other useful business information. The application areas
covered here include:
Big Data Applications: Healthcare
Big Data Applications: Media & Entertainment
o Scheduling optimization
o Increasing acquisition and retention
o Ad targeting
o Content monetization and new product development
BIG DATA TECHNOLOGIES
The list of technology vendors offering big data solutions is seemingly infinite. Many of
the big data solutions that are particularly popular right now fit into one of the following
15 categories:
1. Hadoop
While Apache Hadoop may not be as dominant as it once was, it's nearly impossible to
talk about big data without mentioning this open source framework for distributed
processing of large data sets. Last year, Forrester predicted, "100% of all large enterprises
will adopt it (Hadoop and related technologies such as Spark) for big data
analytics within the next two years."
Over the years, Hadoop has grown to encompass an entire ecosystem of related software,
and many commercial big data solutions are based on Hadoop. In fact, Zion Market
Research forecasts that the market for Hadoop-based products and services will continue
to grow at a 50 percent CAGR through 2022, when it will be worth $87.14 billion, up
from $7.69 billion in 2016.
Key Hadoop vendors include Cloudera, Hortonworks and MapR, and the leading public
clouds all offer services that support the technology.
2. Spark
Apache Spark is part of the Hadoop ecosystem, but its use has become so widespread that
it deserves a category of its own. It is an engine for processing big data within Hadoop,
and it's up to one hundred times faster than the standard Hadoop engine, MapReduce.
In the AtScale 2016 Big Data Maturity Survey, 25 percent of respondents said that they
had already deployed Spark in production, and 33 percent more had Spark projects in
development. Clearly, interest in the technology is sizable and growing, and many
vendors with Hadoop offerings also offer Spark-based products.
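To make this concrete, here is a minimal PySpark sketch of a word count job; it assumes a local Spark installation with the pyspark package and an illustrative input file named input.txt, neither of which comes from the notes above.

# Minimal PySpark sketch: word count on a local text file (illustrative assumptions:
# pyspark is installed and input.txt exists).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("WordCount").master("local[*]").getOrCreate()

lines = spark.sparkContext.textFile("input.txt")      # read the file as an RDD of lines
counts = (lines.flatMap(lambda line: line.split())    # split each line into words
               .map(lambda word: (word, 1))           # emit (word, 1) pairs
               .reduceByKey(lambda a, b: a + b))      # sum the counts per word in parallel

for word, count in counts.take(10):                   # pull a small sample back to the driver
    print(word, count)

spark.stop()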
3. R
R, another open source project, is a programming language and software environment designed for
working with statistics. The darling of data scientists, it is managed by the R Foundation
and available under the GPL 2 license. Many popular integrated development
environments (IDEs), including Eclipse and Visual Studio, support the language.
Several organizations that rank the popularity of various programming languages say that
R has become one of the most popular languages in the world. For example,
the IEEE says that R is the fifth most popular programming language, and
both Tiobe and RedMonk rank it 14th. This is significant because the programming
languages near the top of these charts are usually general-purpose languages that can be
used for many different kinds of work. For a language that is used almost exclusively for
big data projects to be so near the top demonstrates the significance of big data and the
importance of this language in its field.
4. Data Lakes
To make it easier to access their vast stores of data, many enterprises are setting up data
lakes. These are huge data repositories that collect data from many different sources and
store it in its natural state. This is different than a data warehouse, which also collects
data from disparate sources, but processes it and structures it for storage. In this case, the
lake and warehouse metaphors are fairly accurate. If data is like water, a data lake is
natural and unfiltered like a body of water, while a data warehouse is more like a
collection of water bottles stored on shelves.
Data lakes are particularly attractive when enterprises want to store data but aren't yet
sure how they might use it. A lot of Internet of Things (IoT) data might fit into that
category, and the IoT trend is playing into the growth of data lakes.
MarketsandMarkets predicts that data lake revenue will grow from $2.53 billion in 2016
to $8.81 billion by 2021.
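As a rough sketch of the distinction (the file layout, event record and table below are illustrative assumptions, not a product recipe), a data lake keeps each record exactly as it arrived, while a warehouse-style store parses and structures it first:

# Rough sketch: raw, as-is storage (lake) versus parsed, structured storage (warehouse).
# The event, paths and table are illustrative.
import json, os, sqlite3

event = {"sensor_id": 7, "reading": "23.4C", "raw_payload": "<xml>...</xml>"}

# Data lake: store the record in its natural state, as it arrived.
os.makedirs("lake/2016-01-01", exist_ok=True)
with open("lake/2016-01-01/event_0001.json", "w") as f:
    json.dump(event, f)

# Warehouse: parse, clean and structure the record before storing it.
conn = sqlite3.connect("warehouse.db")
conn.execute("CREATE TABLE IF NOT EXISTS readings (sensor_id INTEGER, celsius REAL)")
conn.execute("INSERT INTO readings VALUES (?, ?)", (7, 23.4))
conn.commit()
conn.close()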
5. NoSQL Databases
NoSQL databases specialize in storing unstructured data and providing fast performance,
although they don't provide the same level of consistency as RDBMSes. Popular NoSQL
databases include MongoDB, Redis, Cassandra, Couchbase and many others; even the
leading RDBMS vendors like Oracle and IBM now also offer NoSQL databases.
NoSQL databases have become increasingly popular as the big data trend has grown.
According to Allied Market Research the NoSQL market could be worth $4.2 billion by
2020. However, the market for RDBMSes is still much, much larger than the market for
NoSQL.
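As a small illustration of the schema-less, key-value style many NoSQL stores use, the sketch below writes and reads a value through the redis-py client; it assumes a Redis server running locally on the default port, and the key name is made up for the example.

# Minimal key-value sketch with the redis-py client (assumes a local Redis server;
# the key and value are illustrative).
import redis

r = redis.Redis(host="localhost", port=6379)
r.set("user:42:last_login", "2016-11-03T10:15:00Z")   # write without any fixed schema
print(r.get("user:42:last_login"))                    # values come back as bytes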
6. Predictive Analytics
Predictive analytics is a sub-set of big data analytics that attempts to forecast future
events or behavior based on historical data. It draws on data mining, modeling and
machine learning techniques to predict what will happen next. It is often used for fraud
detection, credit scoring, marketing, finance and business analysis purposes.
In recent years, advances in artificial intelligence have enabled vast improvements in the
capabilities of predictive analytics solutions. As a result, enterprises have begun to invest
more in big data solutions with predictive capabilities. Many vendors, including
Microsoft, IBM, SAP, SAS, Statistica, RapidMiner, KNIME and others, offer predictive
analytics solutions. Zion Market Research says the Predictive Analytics market generated
$3.49 billion in revenue in 2016, a number that could reach $10.95 billion by 2022.
7. In-Memory Databases
In any computer system, the memory, also known as RAM, is orders of magnitude
faster than the long-term storage. If a big data analytics solution can process data that is
stored in memory, rather than data stored on a hard drive, it can perform dramatically
faster. And that's exactly what in-memory database technology does.
Many of the leading enterprise software vendors, including SAP, Oracle, Microsoft and
IBM, now offer in-memory database technology. In addition, several smaller companies
like Teradata, Tableau, Volt DB and DataStax offer in-memory database solutions.
Research from MarketsandMarkets estimates that total sales of in-memory technology
were $2.72 billion in 2016 and may grow to $6.58 billion by 2021.
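The idea can be demonstrated with SQLite's ":memory:" mode, where the whole database lives in RAM and queries never touch the disk; the table and rows below are illustrative, and this is only a sketch of the concept, not an enterprise in-memory database.

# Sketch of the in-memory idea: the database is held entirely in RAM.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 120.0), ("south", 340.5), ("north", 99.9)])

for region, total in conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"):
    print(region, total)
conn.close()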
8. Big Data Security
Because big data repositories present an attractive target to hackers and advanced
persistent threats, big data security is a large and growing concern for enterprises. In the
AtScale survey, security was the second fastest-growing area of concern related to big
data.
According to the IDG report, the most popular types of big data security solutions include
identity and access controls (used by 59 percent of respondents), data encryption (52
percent) and data segregation (42 percent). Dozens of vendors offer big data security
solutions, and Apache Ranger, an open source project from the Hadoop ecosystem, is
also attracting growing attention.
9. Data Governance
Closely related to the idea of security is the concept of governance. Data governance is a
broad topic that encompasses all the processes related to the availability, usability and
integrity of data. It provides the basis for making sure that the data used for big data
analytics is accurate and appropriate, as well as providing an audit trail so that business
analysts or executives can see where data originated.
In the New Vantage Partners survey, 91.8 percent of the Fortune 1000 executives
surveyed said that governance was either critically important (52.5 percent) or important
(39.3 percent) to their big data initiatives. Vendors offering big data governance tools
include Collibra, IBM, SAS, Informatica, Adaptive and SAP.
10. Self-Service Capabilities
With data scientists and other big data experts in short supply — and commanding large
salaries — many organizations are looking for big data analytics tools that allow business
users to serve their own needs. In fact, a report from Research and
Markets estimates that the self-service business intelligence market generated $3.61
billion in revenue in 2016 and could grow to $7.31 billion by 2021. And Gartner has
noted, "The modern BI and analytics platform emerged in the last few years to meet new
organizational requirements for accessibility, agility and deeper analytical insight,
shifting the market from IT-led, system-of-record reporting to business-led, agile
analytics including self-service."
Hoping to take advantage of this trend, multiple business intelligence and big data
analytics vendors, such as Tableau, Microsoft, IBM, SAP, Splunk, Syncsort, SAS,
TIBCO, Oracle and others have added self-service capabilities to their solutions. Time will
tell whether any or all of the products turn out to be truly usable by non-experts and
whether they will provide the business value organizations are hoping to achieve with
their big data initiatives.
11. Artificial Intelligence
While the concept of artificial intelligence (AI) has been around nearly as long as there
have been computers, the technology has only become truly usable within the past couple
of years. In many ways, the big data trend has driven advances in AI, particularly in two
subsets of the discipline: machine learning and deep learning.
The standard definition of machine learning is that it is technology that gives "computers
the ability to learn without being explicitly programmed." In big data analytics, machine
learning technology allows systems to look at historical data, recognize patterns, build
models and predict future outcomes. It is also closely associated with predictive analytics.
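As a minimal illustration of learning from historical data to predict a future outcome (the tiny failure data set and the choice of a decision tree are made up for the example), a few lines of scikit-learn look like this:

# Sketch: learn a pattern from historical observations and predict a new case.
# The data (hours in use, temperature -> failed or not) is illustrative.
from sklearn.tree import DecisionTreeClassifier

X_history = [[100, 40], [1200, 85], [300, 45], [1500, 90], [200, 42], [1400, 88]]
y_history = [0, 1, 0, 1, 0, 1]          # 1 = equipment failed, 0 = did not fail

model = DecisionTreeClassifier().fit(X_history, y_history)
print(model.predict([[1300, 87]]))      # predicted outcome for a new, unseen case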
Deep learning is a type of machine learning technology that relies on artificial neural
networks and uses multiple layers of algorithms to analyze data. As a field, it holds a lot
of promise for allowing analytics tools to recognize the content in images and videos and
then process it accordingly.
Experts say this area of big data tools seems poised for a dramatic takeoff. IDC has
predicted, "By 2018, 75 percent of enterprise and ISV development will include
cognitive/AI or machine learning functionality in at least one application, including all
business analytics tools."
Leading AI vendors with tools related to big data include Google, IBM, Microsoft and
Amazon Web Services, and dozens of small startups are developing AI technology (and
getting acquired by the larger technology vendors).
12. Streaming Analytics
As organizations have become more familiar with the capabilities of big data analytics
solutions, they have begun demanding faster and faster access to insights. For these
enterprises, streaming analytics, with the ability to analyze data as it is being created, is
something of a holy grail. They are looking for solutions that can accept input from
multiple disparate sources, process it and return insights immediately, or as close to it
as possible. This is particularly desirable when it comes to new IoT deployments, which are
helping to drive the interest in streaming big data analytics.
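A toy sketch of the streaming idea (the simulated sensor readings are an illustrative assumption): rather than storing everything and analyzing it later, each value updates a running statistic, and unusual readings trigger an immediate reaction.

# Sketch of streaming analytics: analyze each value as it arrives.
import random
import time

def sensor_stream(n):
    # Simulate n temperature readings arriving one at a time.
    for _ in range(n):
        yield 20 + random.random() * 10
        time.sleep(0.01)                 # stand-in for real arrival delays

count, total = 0, 0.0
for reading in sensor_stream(100):
    count += 1
    total += reading
    running_avg = total / count          # the insight is updated on every new value
    if reading > 28:                     # react immediately to an unusual reading
        print(f"alert: {reading:.1f} (running average {running_avg:.1f})")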
Several vendors offer products that promise streaming analytics capabilities. They
include IBM, Software AG, SAP, TIBCO, Oracle, DataTorrent, SQLstream, Cisco,
Informatica and others. MarketsandMarkets believes the streaming analytics solutions
brought in $3.08 billion in revenue in 2016, which could increase to $13.70 billion by
2021.
13. Edge Computing
In addition to spurring interest in streaming analytics, the IoT trend is also generating
interest in edge computing. In some ways, edge computing is the opposite of cloud
computing. Instead of transmitting data to a centralized server for analysis, edge
computing systems analyze data very close to where it was created — at the edge of the
network.
The advantage of an edge computing system is that it reduces the amount of information
that must be transmitted over the network, thus reducing network traffic and related costs.
It also decreases demands on data centers or cloud computing facilities, freeing up
capacity for other workloads and eliminating a potential single point of failure.
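A toy sketch of that trade-off (the readings and the send function are illustrative assumptions): the device summarizes its own readings and transmits only the summary and any anomalies, instead of every raw value.

# Sketch of edge computing: summarize locally, transmit only the summary.
raw_readings = [21.3, 21.4, 21.2, 35.0, 21.5, 21.3]     # collected on the device

summary = {
    "count": len(raw_readings),
    "mean": sum(raw_readings) / len(raw_readings),
    "max": max(raw_readings),
    "anomalies": [r for r in raw_readings if r > 30],    # only unusual values in full
}

def send_to_cloud(payload):
    print("transmitting", payload)       # stand-in for a real network call

send_to_cloud(summary)                   # one small message instead of every reading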
While the market for edge computing, and more specifically for edge computing
analytics, is still developing, some analysts and venture capitalists have begun calling the
technology the "next big thing."
14. Blockchain
Also a favorite with forward-looking analysts and venture capitalists, blockchain is the
distributed database technology that underlies the Bitcoin digital currency. The unique
feature of a blockchain database is that once data has been written, it cannot be deleted or
changed after the fact. In addition, it is highly secure, which makes it an excellent choice
for big data applications in sensitive industries like banking, insurance, health care, retail
and others.
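A minimal sketch of the append-only idea (the records are made up, and this shows the general hash-chaining principle, not any particular blockchain's implementation): each block stores the hash of the previous block, so changing an earlier record breaks the chain and is detectable.

# Sketch of hash chaining: tampering with an earlier block invalidates the chain.
import hashlib, json

def block_hash(block):
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

chain = [{"index": 0, "data": "genesis", "prev_hash": "0" * 64}]

def append(data):
    prev = chain[-1]
    chain.append({"index": prev["index"] + 1, "data": data,
                  "prev_hash": block_hash(prev)})

append("payment: A -> B, 10 units")
append("payment: B -> C, 4 units")

chain[1]["data"] = "payment: A -> B, 1000 units"   # attempt to rewrite history
ok = all(chain[i]["prev_hash"] == block_hash(chain[i - 1]) for i in range(1, len(chain)))
print("chain valid:", ok)                          # False after the tampering above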
Blockchain technology is still in its infancy and use cases are still developing. However,
several vendors, including IBM, AWS, Microsoft and multiple startups, have rolled out
experimental or introductory solutions built on blockchain technology.
15. Prescriptive Analytics
Many analysts divide big data analytics tools into four big categories. The first,
descriptive analytics, simply tells what happened. The next type, diagnostic analytics,
goes a step further and provides a reason for why events occurred. The third type,
predictive analytics, discussed in depth above, attempts to determine what will happen
next. This is as sophisticated as most analytics tools currently on the market can get.
However, there is a fourth type of analytics that is even more sophisticated, although very
few products with these capabilities are available at this time. Prescriptive analytics
offers advice to companies about what they should do in order to make a desired result
happen. For example, while predictive analytics might give a company a warning that the
market for a particular product line is about to decrease, prescriptive analytics will
analyze various courses of action in response to those market changes and forecast the
most likely results.
Currently, very few enterprises have invested in prescriptive analytics, but many analysts
believe this will be the next big area of investment after organizations begin experiencing
the benefits of predictive analytics.
HADOOP
Hadoop is an open source distributed processing framework that manages data processing
and storage for big data applications running in clustered systems. It is at the center of a
growing ecosystem of big data technologies that are primarily used to support advanced
analytics initiatives, including predictive analytics, data mining and machine learning
applications. Hadoop can handle various forms of structured and unstructured data,
giving users more flexibility for collecting, processing and analyzing data than relational
databases and data warehouses provide.
Hadoop runs on clusters of commodity servers and can scale up to support thousands of
hardware nodes and massive amounts of data. It uses a namesake distributed file system
that's designed to provide rapid data access across the nodes in a cluster, plus fault-
tolerant capabilities so applications can continue to run if individual nodes fail.
Consequently, Hadoop became a foundational data management platform for big data
analytics uses after it emerged in the mid-2000s.
HISTORY OF HADOOP
Hadoop was created by computer scientists Doug Cutting and Mike Cafarella, initially to
support processing in the Nutch open source search engine and web crawler. After
Google published technical papers detailing its Google File System (GFS) and
MapReduce programming framework in 2003 and 2004, Cutting and Cafarella modified
earlier technology plans and developed a Java-based MapReduce implementation and a
file system modeled on Google's.
In early 2006, those elements were split off from Nutch and became a separate Apache
subproject, which Cutting named Hadoop after his son's stuffed elephant. At the same
time, Cutting was hired by internet services company Yahoo, which became the first
production user of Hadoop later in 2006.
Use of the framework grew over the next few years, and three independent Hadoop
vendors were founded: Cloudera in 2008, MapR a year later and Hortonworks as a Yahoo
spinoff in 2011. In addition, AWS launched a Hadoop cloud service called Elastic
MapReduce in 2009. That was all before Apache released Hadoop 1.0.0, which became
available in December 2011 after a succession of 0.x releases.
Put simply: Hadoop has two main components. The first component, the Hadoop
Distributed File System, helps split the data, put it on different nodes, replicate it and
manage it. The second component, MapReduce, processes the data on each node in
parallel and calculates the results of the job. There is also a resource management layer,
YARN (Yet Another Resource Negotiator), that schedules and manages the data processing jobs.
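A small local simulation of the MapReduce idea (this mimics the map, shuffle and reduce phases in plain Python for illustration; it is not the actual Hadoop API, and the input lines are made up):

# Sketch of MapReduce: map lines to (word, 1), group by word, reduce to counts.
from collections import defaultdict

lines = ["big data needs big storage", "hadoop splits big jobs"]

# Map phase: each line is processed independently, so different lines
# could run on different nodes.
mapped = [(word, 1) for line in lines for word in line.split()]

# Shuffle phase: pairs with the same key are grouped together.
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: each group is reduced to a single result, again independently.
word_counts = {word: sum(counts) for word, counts in groups.items()}
print(word_counts)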
Key advantages of Hadoop include the following:
It can store and process vast amounts of structured and unstructured data quickly.
Application and data processing are protected against hardware failure: if one node goes down,
jobs are redirected automatically to other nodes so that the distributed computing doesn't fail.
The data doesn't have to be preprocessed before it's stored. Organizations can store as much
data as they want, including unstructured data such as text, videos and images, and decide how
to use it later.
It's scalable, so companies can add nodes to enable their systems to handle more data.
It can analyze data in real time to enable better decision making.
HADOOP APPLICATIONS
YARN greatly expanded the applications that Hadoop clusters can handle to include
stream processing and real-time analytics applications run in tandem with processing
engines, like Apache Spark and Apache Flink. For example, some manufacturers are
using real-time data that's streaming into Hadoop in predictive maintenance applications to try
to detect equipment failures before they occur. Fraud detection, website personalization and
customer experience scoring are other real-time use cases.
Because Hadoop can process and store such a wide assortment of data, it enables
organizations to set up data lakes as expansive reservoirs for incoming streams of
information. In a Hadoop data lake, raw data is often stored as is so data scientists and
other analysts can access the full data sets if need be; the data is then filtered and
prepared by analytics or IT teams as needed to support different applications.
Data lakes generally serve different purposes than traditional data warehouses that hold
cleansed sets of transaction data. But, in some cases, companies view their Hadoop data
lakes as modern-day data warehouses. Either way, the growing role of big data analytics
in business decision-making has made effective data governance and data security
processes a priority in data lake deployments.
Risk management -- financial institutions use Hadoop clusters to develop more accurate
risk analysis models for their customers. Financial services companies can use Hadoop to
build and run applications to assess risk, build investment models and develop trading
algorithms.
Predictive maintenance -- with input from IoT devices feeding data into big data
programs, companies in the energy industry can use Hadoop-powered analytics to help
predict when equipment might fail to determine when maintenance should be performed.
Supply chain risk management -- manufacturing companies, for example, can track the
movement of goods and vehicles so they can determine the costs of various transportation
options. Using Hadoop, manufacturers can analyze large amounts of historical, time-
stamped location data as well as map out potential delays so they can optimize their
delivery routes.
BEST BIG DATA ANALYTICS TOOLS
Big Data Analytics software is widely used in providing meaningful analysis of a large
set of data. This software helps in finding current market trends, customer preferences,
and other information.
Here are eight top Big Data Analytics tools with their key features.
1. Apache Hadoop
Apache Hadoop is the long-standing champion in the field of Big Data processing, well known
for its huge-scale data processing capabilities. This open source Big Data framework can run
on-prem or in the cloud and has quite low hardware requirements. The main Hadoop benefits
and features are as follows:
Hadoop Libraries — the needed glue for enabling third party modules to work
with Hadoop
2. Apache Spark
Apache Spark is the alternative — and in many aspects the successor — of Apache
Hadoop. Spark was built to address the shortcomings of Hadoop and it does this
incredibly well. For example, it can process both batch data and real-time data, and
operates up to 100 times faster than MapReduce. Spark provides in-memory data
processing capabilities, which are far faster than the disk-based processing used by
MapReduce. In addition, Spark works with HDFS, OpenStack and Apache Cassandra,
both in the cloud and on-prem, adding another layer of versatility to big data operations
for your business.
3. Apache Storm
Storm is another Apache product, a real-time framework for data stream processing,
which supports any programming language. The Storm scheduler balances the workload
between multiple nodes based on the topology configuration and works well with Hadoop
HDFS. Apache Storm has the following benefits:
Built-in fault-tolerance
Auto-restart on crashes
Clojure-written
4. Apache Cassandra
Apache Cassandra is one of the pillars behind Facebook's massive success, as it allows
processing of structured data sets distributed across a huge number of nodes across the globe.
It works well under heavy workloads due to its architecture without single points of failure,
and it boasts unique capabilities that no other NoSQL or relational database has, such as:
Built-in high-availability
5. MongoDB (https://www.guru99.com/mongodb-tutorials.html)
MongoDB is another great example of an open source NoSQL database with rich
features; it is cross-platform and compatible with many programming languages. IT Svit
uses MongoDB in a variety of cloud computing and monitoring solutions, and we
specifically developed a module for automated MongoDB backups using Terraform. The
most prominent MongoDB features are:
Stores any type of data, from text and integers to strings, arrays, dates and booleans
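A short pymongo sketch of that flexible document model (it assumes a MongoDB server running locally on the default port; the database, collection and document are illustrative):

# Sketch: one MongoDB document mixing strings, numbers, arrays, dates and booleans.
from datetime import datetime
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
events = client["demo_db"]["events"]

events.insert_one({
    "user": "alice",
    "score": 42,
    "tags": ["big data", "nosql"],
    "created_at": datetime.utcnow(),
    "active": True,
})
print(events.find_one({"user": "alice"}))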
6. R Programming Environment
R is mostly used along with the JuPyteR stack (Julia, Python, R) for enabling wide-scale
statistical analysis and data visualization. The JupyteR Notebook is one of the four most
popular Big Data visualization tools, as it allows composing virtually any analytical model.
Key strengths include:
R is highly portable
R easily scales from a single test machine to vast Hadoop data lakes
7. Neo4j
Neo4j is an open source graph database that stores data as interconnected nodes and
relationships, with properties stored as key-value pairs. IT Svit has recently built a resilient
AWS infrastructure with Neo4j for one of our customers, and the database performs well
under a heavy workload of network data and graph-related requests.
8. Apache SAMOA
This is another member of the Apache family of tools used for Big Data processing. SAMOA
specializes in building distributed streaming algorithms for Big Data mining. This tool is
built with a pluggable architecture and must be used atop other Apache products, like the
Apache Storm engine mentioned earlier. Its other features used for Machine Learning include
the following:
Clustering
Classification
Normalization
Regression
Using Apache SAMOA enables distributed stream processing engines to provide tangible
benefits for real-time Big Data mining.
Final thoughts on the list of hot Big Data tools for 2018
The Big Data industry and data science evolve rapidly and have progressed a great deal lately,
with multiple Big Data projects and tools launched in 2017. Big Data remains one of the
hottest IT trends of 2018, along with IoT, blockchain, and AI & ML.
PREDICTIVE ANALYTICS
Predictive analytics is a form of advanced analytics that uses both new and
historical data to forecast activity, behavior and trends. It involves applying
statistical analysis techniques, analytical queries and automated machine learning
algorithms to data sets to create predictive models that place a numerical value --
or score -- on the likelihood of a particular event happening.
Predictive analytics software applications use variables that can be measured and
analyzed to predict the likely behavior of individuals, machinery or other entities.
For example, an insurance company is likely to take into account potential driving
safety variables, such as age, gender, location, type of vehicle and driving record,
when pricing and issuing auto insurance policies.
Multiple variables are combined into a predictive model capable of assessing
future probabilities with an acceptable level of reliability. The software relies
heavily on advanced algorithms and methodologies, such as logistic regression
models, time series analysis and decision trees.
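As a hedged illustration of that scoring idea (the variables and numbers are invented for the example and the model choice is ours, not taken from any real insurer), a logistic regression can turn a few measured variables into a probability score:

# Sketch of a predictive score: logistic regression outputs the probability of an event.
# Training data (driver age, years of driving record, past claims -> claim filed) is made up.
from sklearn.linear_model import LogisticRegression

X_train = [[18, 1, 2], [45, 20, 0], [30, 10, 1], [60, 35, 0], [22, 3, 2], [50, 25, 0]]
y_train = [1, 0, 1, 0, 1, 0]             # 1 = filed a claim, 0 = did not

model = LogisticRegression().fit(X_train, y_train)

new_applicant = [[25, 5, 1]]
score = model.predict_proba(new_applicant)[0][1]    # probability of a future claim
print(f"claim risk score: {score:.2f}")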
Predictive analytics has grown in prominence alongside the emergence of big data
systems. As enterprises have amassed larger and broader pools of data in Hadoop
clusters and other big data platforms, they have created increased data mining
opportunities to gain predictive insights. Heightened development and
commercialization of machine learning tools by IT vendors has also helped
expand predictive analytics capabilities.
MOBILE BUSINESS INTELLIGENCE
Mobile business intelligence (mobile BI) refers to the ability to provide business
and data analytics services to mobile/handheld devices and/or remote users. MBI enables
users with limited computing capacity to use and receive the same or similar features,
capabilities and processes as those found in a desktop-based business intelligence
software solution.
One of the major problems customers face when using mobile devices for
information retrieval is the fact that mobile BI is no longer as simple as the pure
display of BI content on a mobile device. Moreover, a mobile strategy has to be
defined to cope with different suppliers and systems as well as private phones.
Besides attempts to standardize with the same supplier, companies are also
concerned that solutions should have robust security features. These points have
led many to the conclusion that a proper concept and strategy must be in place
before supplying corporate information to mobile devices.
The first major benefit is the ability for end users to access information in their
mobile BI system at any time and from any location. This enables them to get
data and analytics in 'real time', which improves their daily operations and
means they can react more quickly to a wider range of events.
MBI works much like a standard BI software/solution but it is designed specifically for
handheld users. Typically, MBI requires a client end utility to be installed on mobile
devices, which remotely/wirelessly connect over the Internet or a mobile network to the
primary business intelligence application server. Upon connection, MBI users can
perform queries, and request and receive data. Similarly, clientless MBI solutions can be
accessed through a cloud server that provides Software as a Service business intelligence
(SaaS BI) or real-time business intelligence (RTBI, also called Real-Time BI).
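A rough sketch of what such a client-end request might look like (the server URL, endpoint, parameters and token below are entirely hypothetical, not any vendor's real API): the handheld client sends a query over HTTPS and receives the result set back.

# Hypothetical mobile BI client request; the endpoint and fields are illustrative only.
import requests

BI_SERVER = "https://bi.example.com/api/v1/query"          # hypothetical BI server

response = requests.get(
    BI_SERVER,
    params={"report": "daily_sales", "region": "north"},    # hypothetical query
    headers={"Authorization": "Bearer <access-token>"},     # placeholder token
    timeout=10,
)
response.raise_for_status()
for row in response.json().get("rows", []):
    print(row)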
WHAT IS CROWDSOURCING?
Crowdsourcing data collection consists of building data sets with the help of a large
group of people. There is a data source and there are data suppliers who are willing to
enrich the data with relevant, missing or new information.
This method originates from the scientific world. One of the first ever cases of
crowdsourcing was the Oxford English Dictionary. The project aimed to list all the words
that enjoy any recognized lifespan in standard English, along with their definitions and
explanations of usage. That was a gigantic task, so the dictionary creators invited the
crowd to help them on a voluntary basis.
Sounds familiar? Think no further than Wikipedia.
More than 1 million mappers work together to collect and supply data to OpenStreetMap,
making it full of valuable information about locations around the world.
THE IMPORTANCE OF CROWDSOURCING
The Internet is now a melting pot of user-generated content from blogs to Wikipedia
entries to YouTube videos. The distinction between producer and consumer is no longer
so clear-cut, as everyone is equipped with the tools needed to create as well as consume.
As a business strategy, soliciting customer input isn't new, and open source software has
proven the productivity possible through a large group of individuals.
THE HISTORY OF CROWDSOURCING
While the idea behind crowdsourcing isn't new, its active use online as a business-building
strategy has only been around since 2006. The phrase was initially coined by Jeff Howe, who
described a world in which people outside of a company contribute work toward that company's
projects. Video games have been utilizing crowdsourcing for many years through their beta
invitations: granting players early access to the game, studios request only that these
passionate gamers report bugs and gameplay issues as they encounter them, before the finished
product is released for sale and distribution.
Companies utilize crowdsourcing not only in a research and development capacity, but
also to simply get help from anyone for anything, whether it's word-of-mouth marketing,
creating content or giving feedback.
INFORMATION MANAGEMENT
Information management is concerned with deciding who should be able to access and read
information, and with whom it should or should not be shared.
Around 1970, when fourth-generation computing was in its early phase, computer scientists
developed various concepts for securing data (that is, preventing unauthorized access), and
they created the concept of the object, which focused on data security rather than just logic.
Before objects, there were entities called structures and unions that were used to manage
data-structure algorithms. These were quite similar to objects, but they could not encapsulate
behaviour the way an object encapsulates both its attributes and its behaviour. The concept of
object orientation developed with the first object-oriented language, SIMULA 67, but it
attracted far more attention when Bjarne Stroustrup applied the same concept in the release
of C++.
Need for Information Management in Web Applications
Web applications of the 20th century were not as secure as the web applications of today.
No system is perfect, and object-oriented systems also have certain limitations, but they
have largely resolved the problem of data security at the level of program logic: it is now
very hard to access data if you are an unauthorized user.
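A minimal Python sketch of that encapsulation idea (the class, the record and the token check are illustrative, not a real authorization framework): the data is kept in a private attribute and exposed only through a method that checks the caller's authorization.

# Sketch: encapsulation used for access control.
class PatientRecord:
    def __init__(self, name, diagnosis):
        self.__data = {"name": name, "diagnosis": diagnosis}   # not directly exposed
        self.__authorized_tokens = {"doctor-123"}

    def read(self, token):
        # Return the record only to authorized callers.
        if token not in self.__authorized_tokens:
            raise PermissionError("not authorized to read this record")
        return dict(self.__data)                               # a copy, not the internal state

record = PatientRecord("Alice", "flu")
print(record.read("doctor-123"))        # authorized access succeeds
try:
    record.read("stranger-999")         # unauthorized access is rejected
except PermissionError as err:
    print("blocked:", err)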
In the 1990s, when Sabeer Bhatia created the web-based email service Hotmail, people thought
many times before using it, because the chance of information leakage was very high. That is
why the object-oriented approach was adopted in web languages such as ASP and PHP: to ensure
data security and to build applications on top of self-contained environments, i.e.
frameworks, which are libraries of classes and functions that make programming easier.
Security matters because someone could use our information to track us or for illegal
purposes, so this concept came to be used by almost every popular programming language.
Information management is an essential part of today's web development: it ensures that data
is shared only with authorized users. It is not an easy task to manage the whole system,
delegate authority to users and manage privacy.
Facebook is a good example of an information system. Facebook provides privacy controls to
its users, which is one of the capabilities a well-designed information system should offer.
It grants authority to the graph nodes connected to you (i.e., your friends) to access the
information you have chosen to share with them. Nobody can see your private information on
Facebook except those to whom you have granted the authority to see it.