2 LecturE 1 2

This document provides an introduction to big data, including its key characteristics, opportunities, and challenges. It discusses the 3 Vs, 5 Vs, and 8 Vs frameworks used to define big data. Examples are given of sources of modern data like social media, IoT devices, and digital photos. Big data opportunities include better decision making and cost reduction through analytics. Challenges include dealing with huge data volumes, variety, and velocity as well as timely analytics and lack of experts. Coca Cola is presented as using big data to optimize supply chains and understand customer behavior.

Uploaded by

Rebecca
BIG DATA ANALYTICS

SEN-332

PART-1
INTRODUCTION TO BIG DATA
Lecture Outline & Objectives
– Modern World Data
• Characteristics
• Sources
• Examples
– Big Data
• 3 Vs
• 5 Vs
• 8 Vs
– Big Data Opportunities
– Big Data Challenges
– Big Data Examples

Objectives: Provide fundamental information to gain insight into the
challenges and opportunities of big data.
Modern World Data
• The concept of data in the modern world has
changed.
• It has moved beyond a collection of facts and
figures.
• The importance of modern world data can be felt by
imagining our lives without data…
Characteristics of Modern World Data
• Speed
• Variety
• Volume
• Value
• Visualization & Modeling
• Veracity
Sources of Modern World Data
• Social Media
• Satellites
• Newspapers
• News Channels
• Research Papers
• Hospitals
• Education Sector
• Cell Phones
• Artificial Agents
• Many, many more...
Example of Modern World Data
• Internet
– We conduct more than half of our web searches from a
mobile phone now.
– More than 3.7 billion humans use the internet (that’s a
growth rate of 7.5 percent over 2016).
– On average, Google now processes more than 40,000
searches EVERY second (3.5 billion searches per day)!
While 77% of searches are conducted on Google, it would
be remiss not to remember other search engines are also
contributing to our daily data generation. Worldwide there
are 5 billion searches a day.
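The search figures above can be cross-checked with simple arithmetic. The sketch below is only a back-of-the-envelope check: the helper name per_day is hypothetical, and the inputs are the rounded "more than" figures quoted on the slide, so the results are approximate.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def per_day(rate_per_second: float) -> float:
    """Convert a per-second rate into a per-day total."""
    return rate_per_second * SECONDS_PER_DAY


# Figures quoted on the slide (rounded, so treat results as estimates).
google_per_second = 40_000
google_share = 0.77  # share of all searches conducted on Google

google_per_day = per_day(google_per_second)
# If Google handles 77% of searches, the worldwide total is implied by division.
worldwide_estimate = google_per_day / google_share

print(f"Google searches per day: {google_per_day:,.0f}")
print(f"Implied worldwide searches per day: {worldwide_estimate:,.0f}")
```

40,000 searches per second works out to roughly 3.5 billion per day, and dividing by the 77% share gives about 4.5 billion worldwide, which is in line with the "5 billion searches a day" figure once the "more than" qualifiers are taken into account.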
Example of Modern World Data
• Social Media
– Snapchat users share 527,760 photos (Every Minute)
– More than 120 professionals join LinkedIn (Every Minute)
– Users watch 4,146,600 YouTube videos (Every Minute)
– 456,000 tweets are sent on Twitter (Every Minute)
– Instagram users post 46,740 photos (Every Minute)
– 1.5 billion people are active on Facebook daily
– Europe has more than 307 million people on Facebook
– There are five new Facebook profiles created every second!
– More than 300 million photos get uploaded per day
– Every minute there are 510,000 comments posted and 293,000 statuses
updated
– There are 600 million Instagrammers, 400 million of whom are active every day
– Each day 95 million photos and videos are shared on Instagram
– 100 million people use the Instagram “stories” feature daily
Example of Modern World Data
• Communication (Every Minute)
– We send 16 million text messages
– There are 990,000 Tinder swipes
– 156 million emails are sent; worldwide it is
expected that there will be 2.9 billion email users by
2019
– 15,000 GIFs are sent via Facebook Messenger
– Every minute there are 103,447,520 spam emails
sent
– There are 154,200 calls on Skype
Example of Modern World Data
• Digital Photos
– People will take 1.2 trillion photos by the end of
2017
– There will be 4.7 trillion photos stored
Example of Modern World Data
• Services (Every Minute)
– The Weather Channel receives 18,055,556
forecast requests
– Venmo processes $51,892 in peer-to-peer
transactions
– Spotify adds 13 new songs
– Uber riders take 45,788 trips!
– There are 600 new page edits to Wikipedia
Example of Modern World Data
• Internet of Things
– The Internet of Things, connected “smart” devices
that interact with each other and us while collecting
all kinds of data, is exploding (from 2 billion devices in
2006 to a projected 200 billion by 2020) and is one of
the primary drivers for our data vaults exploding as
well.
– There are 33 million voice-first devices in circulation
– 8 million people use voice control each month
– Voice search queries in Google for 2016 were up 35
times over 2008
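To get a feel for what "exploding" means here, the device counts quoted above can be turned into an average yearly growth rate. This is a minimal sketch: the function name annual_growth_rate is hypothetical, and it assumes smooth compound growth between the two quoted data points, which real adoption curves do not follow exactly.

```python
def annual_growth_rate(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values, as a fraction."""
    return (end_value / start_value) ** (1 / years) - 1


# IoT devices: 2 billion in 2006 to a projected 200 billion by 2020.
iot_rate = annual_growth_rate(2e9, 200e9, 2020 - 2006)

print(f"Average yearly growth in connected devices: {iot_rate:.1%}")
```

A hundredfold increase over 14 years corresponds to growth of roughly 39% per year, compounded.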
Example of Modern World Data
• Just imagine...
– How much data you generate daily.
– How much data the students of this class generate daily.
– How much data the students of this department
generate daily.
– How much data the students of this university generate
daily.
– How much data all the students of Pakistan generate daily.
– And finally, how much data the students of the entire world
generate daily.
– And this is only students…
Conclusions So Far
• Modern world data has to be treated differently,
beyond the boundaries of traditional data.
Everything about traditional data, from its name
to its management, has changed.
• This is what we will study in this course from
here onwards: the techniques and tools used
to manage modern world data and to extract useful
information by processing it.
What is Big Data
• Can we call NADRA's data Big Data?
• Can data of over 2000 TB be called Big
Data?
• How large do you think data should be to be
called Big Data?
• Does Big Data relate only to the volume of data?
What is Big Data
• Big data is data that contains greater variety
arriving in increasing volumes and with ever-
higher velocity. This is known as the three Vs.
• Put simply, big data is larger, more complex data
sets, especially from new data sources. These
data sets are so voluminous that traditional data
processing software just can’t manage them. But
these massive volumes of data can be used to
address business problems you wouldn’t have
been able to tackle before.
The Three Vs of Big Data
• Volume
The amount of data matters. For data to be labeled as big data, it
should be of enormous volume. There is no exact figure
that quantifies this characteristic. As a rule of
thumb, it is a volume that traditional database
technologies and file systems cannot manage efficiently.
• Velocity
The velocity of data matters. For data to be labeled as big data, it
should be arriving or being generated at exponentially increasing rates.
• Variety
The variety of data also matters. For data to be labeled as big
data, it should come in a variety of types and formats, such as text,
videos, images, PDF, TXT, JSON, XML, and CSV.
Initially, data was called Big Data if it had all three characteristics.
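Variety is easier to appreciate with a small example. The sketch below is an illustration only: the sample records, the field names user and clicks, and the three helper functions are all hypothetical. It shows the same kind of event arriving as JSON, CSV, and XML, and being normalized into one common shape before analysis, using only the Python standard library.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample inputs: the same kind of event in three formats.
json_text = '{"user": "alice", "clicks": 3}'
csv_text = "user,clicks\nbob,5\n"
xml_text = "<event><user>carol</user><clicks>2</clicks></event>"


def from_json(text):
    """Parse one JSON object into the common record shape."""
    obj = json.loads(text)
    return {"user": obj["user"], "clicks": int(obj["clicks"])}


def from_csv(text):
    """Parse CSV rows (with a header line) into common records."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"user": row["user"], "clicks": int(row["clicks"])} for row in reader]


def from_xml(text):
    """Parse one XML <event> element into the common record shape."""
    root = ET.fromstring(text)
    return {"user": root.findtext("user"), "clicks": int(root.findtext("clicks"))}


# Once every source is normalized, a single analysis step can run over all of them.
records = [from_json(json_text), *from_csv(csv_text), from_xml(xml_text)]
total_clicks = sum(record["clicks"] for record in records)

print(records)
print("Total clicks:", total_clicks)
```

Each new format needs its own parser, which is one reason variety is counted as a defining characteristic alongside volume and velocity.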
The Five Vs of Big Data
• First three being Volume, Velocity and Variety.
• Value
For data to be called big data, it must have
value. Value means that the data produces
useful information and insights.
• Veracity
For data to be called big data, it must also have
veracity. Veracity describes the trustworthiness,
reliability and accuracy of the content.
The Eight Vs of Big Data
Some more Definitions of Big Data
• Big Data is a broad term for data sets so large or
complex that they are difficult to process using
traditional data processing applications. Challenges
include analysis, capture, curation, search, sharing,
storage, transfer, visualization, and information privacy.
(Wikipedia)
• Big data is high-volume, high-velocity and high-variety
information assets that demand cost-effective,
innovative forms of information processing for
enhanced insight and decision making.
(Gartner)
Conclusion about Big Data
• So many definitions and Vs.
• To keep it simple, let's stick to the following:
– Big data is data that traditional data management tools and
techniques cannot handle effectively.
– The first three Vs are the most important: Variety, Velocity
and Volume.
• The other Vs matter in particular
contexts.
Big Data Opportunities
• Big Data has opened up great opportunities for businesses,
industries and organizations worldwide to understand
the different dimensions and facets of themselves and
of the stakeholders associated with them. This leads to the following
major achievements:
– Better Decision Making
– Improved Stakeholder Management
– Cost Reduction
– Optimum Solutions
– Timely Availability of Resources
– And Many More
• Big data opportunities are realized through Big Data Analytics.
We will discuss this in detail in the next lecture.
Big Data Challenges
• Dealing with Huge Volumes of Data
– Storage Availability
– Computational Power
– Monetary Concerns
– Data Backups
– Failure Tolerance
– Database Technology
– File Systems
• Dealing with Exponentially increasing Data
– Storage Scalability
– Computational Scalability
• Dealing with Great Variety of Data
– Schema Management
– Data Integration
– Application Level Management
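The volume challenge can be illustrated with a small sketch. The idea shown here, processing data in fixed-size chunks and merging partial results instead of loading everything at once, is a common way to keep memory use bounded. The function names and the tiny sample input are hypothetical, and real big data systems spread this same pattern across many machines rather than running it in a single process.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def read_in_chunks(lines: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield lists of at most chunk_size lines, so only one chunk is held at a time."""
    chunk: List[str] = []
    for line in lines:
        chunk.append(line)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly smaller, chunk
        yield chunk


def count_words(lines: Iterable[str], chunk_size: int = 2) -> Counter:
    """Count words chunk by chunk and merge the partial counts."""
    totals: Counter = Counter()
    for chunk in read_in_chunks(lines, chunk_size):
        partial = Counter(word for line in chunk for word in line.lower().split())
        totals.update(partial)  # merge this chunk's counts into the running total
    return totals


# A tiny stand-in for a data source far too large to load in one go.
sample = ["big data is big", "data arrives fast", "fast data is varied"]
counts = count_words(sample)

print(counts.most_common(3))
```

Because lines can be any iterable, the same function would also accept an open file object and read it line by line.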
Big Data Challenges
• Getting Insights in a Timely Manner
– Timely Analytics
– Changing Trends
– Data Mining
• Relatively new Technology
– Lack of Experts
– Still Developing
– Changing Technologies
• Validating data
– Non-availability of reference
– Huge Input Sources
– Computations required in validations
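When no reference data set is available, validation often falls back on plausibility rules applied to every incoming record. The sketch below is one possible illustration: the validate function, the field names, and the 0 to 120 age range are assumptions made for this example, not rules taken from the lecture.

```python
from typing import Dict, List


def validate(record: Dict[str, object]) -> List[str]:
    """Return a list of problems found in one record (empty if it looks valid)."""
    problems: List[str] = []

    user = record.get("user")
    if not isinstance(user, str) or not user.strip():
        problems.append("missing user")

    age = record.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or not 0 <= age <= 120:
        problems.append("age out of range")

    return problems


# Hypothetical records arriving from different input sources.
records = [
    {"user": "alice", "age": 30},
    {"user": "", "age": 25},
    {"user": "bob", "age": 430},
]

report = {index: validate(record) for index, record in enumerate(records)}
valid = [record for record in records if not validate(record)]

print(report)
print(len(valid), "of", len(records), "records passed")
```

Every rule is an extra computation on every record, which is why validation cost grows with both the number of input sources and the volume of data.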
Big Data Challenges
• Data Security
– Various Input Sources
– Varied Data Types
– Computational Complexities
• Organizational resistance
– Status Quo
– Training
– Relatively Difficult
– Completely New Technology
Big Data Examples
• Coca Cola
Coca Cola has been a leader in the consumer packaged goods industry for over a
century, and their brands are iconic. They distribute their products to a global network
of retailers, have many SKUs, and must be able to predict buyer behavior to ensure they
have the right inventory, run the right promotional ads in the marketplace, and sponsor the right
events worldwide.
• Coca Cola has been able to get wins with Big Data analytics by:
• Selecting the ideal ingredient mix to produce juice products
• Creating efficiencies in their warehousing, restaurant and retail supply chain
operations
• Mining loyalty program, competitive, POS and social media data to understand
buyer behavior
• Creating digital service centers for procurement and HR processes
• Leveraging a new breed of storage media to retain, process and analyze vast amounts
of information
Coca Cola’s customers are in 206 countries, a vastly diverse marketplace with tens of
millions of ultimate consumers. Effectively managing the information relating to their
clients, employees, suppliers and media assets requires effective storage, powerful
indexing and search functionality, and innovative solutions to make sure information can
be located and used when required. Big Data solutions have provided Coca Cola with
this ability.
Big Data Examples
• Netflix
To make sure its clients keep watching its programming, Netflix is constantly analyzing trends in:
• Program viewership
• The content its customers are consuming
• The colors of the promotional visuals of its programming
• Devices its clients are watching its programming on
• Whether a viewer watches a portion of a movie, a season of a
series, or a complete series back to back in a weekend binge
watching session
For many entertainment, technology and media organizations, Big Data analytics is the key to
retaining subscribers, securing advertising revenues, and understanding the sort of content
to serve as it relates to geographical locations, time of day, demographics, and opinions
expressed on social media. Big Data gives Netflix the ability to deliver the content the
customer wants to see, when the customer wants it.
More Big Data Examples
• Some companies that use Big Data are:
– Facebook
– Amazon
– Yahoo
– Google
– Twitter
– American Express
Conclusion
• We are living in an ocean of data, where more and more
types of data are being generated at enormous
speed.
• The data surrounding us provides an opportunity to discover
insights that were not obtainable before.
• Managing and utilizing this data is a challenge.
• This data is called Big Data, which is a relatively new
concept.
• Large amounts of data existed previously as well, but we
lacked the computational power and techniques to get insights from them.
• With the advent of greater computational power and better techniques, the world
has become more data-centric, and more and more data is
being generated and analyzed to get useful insights.