Basic Concepts in Big Data
Private Sector
Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data.
Facebook handles 40 billion photos from its user base.
The Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts worldwide.
Science
The Large Synoptic Survey Telescope will generate 140 terabytes of data every 5 days.
Biomedical computation, such as decoding the human genome and personalized medicine.
Social science revolution.
Lifecycle of Data: 4 As
Acquisition
Aggregation
Analysis
Application
[Figure: data pipeline from Data through Storage, Formatting and Cleaning, and Data Integration to Data Analysis.]
[Figure: CS199 topic map built around the same pipeline. Labels: Data Access, Data Understanding, Databases, Information Retrieval, Data Warehousing, Data Integration, Formatting and Cleaning, Storage, Signal Processing, Information Theory, Machine Learning, Data Mining, Data Analysis, Predictive Modeling, Clustering, Data Visualization. Many applications!]
Example of Analysis: Clustering & Latent Factor Analysis
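[Figure: user-by-movie rating matrix. Rows User 1 through User n fall into Group U1 and Group U2; columns Movie 1 through Movie m fall into Group M1 and Group M2. Ratings shown include 3.5 and 5. In the second panel one entry is marked "=?", the rating to be predicted.]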
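The rating-matrix example can be sketched as a small latent factor model. The code below is only an illustration under stated assumptions, not the method used in the slides: the 4 x 4 ratings, the rank of 2, the learning rate, and the helper name `factorize` are all hypothetical. It fits two low-rank factors to the observed ratings by gradient descent and then reads the missing rating off their product.

```python
import numpy as np

def factorize(ratings, mask, rank=2, steps=5000, lr=0.01, reg=0.02, seed=0):
    """Fit ratings ~ U @ V.T using only the entries where mask is True."""
    rng = np.random.default_rng(seed)
    n_users, n_movies = ratings.shape
    # Small random start so the factors can grow in any direction.
    U = 0.1 * rng.standard_normal((n_users, rank))
    V = 0.1 * rng.standard_normal((n_movies, rank))
    for _ in range(steps):
        # Residual on observed entries; unobserved entries contribute zero.
        err = mask * (ratings - U @ V.T)
        # Gradient step on squared error with a small L2 penalty.
        U_step = err @ V - reg * U
        V_step = err.T @ U - reg * V
        U += lr * U_step
        V += lr * V_step
    return U, V

# Hypothetical toy data: two user groups (rows 0-1, rows 2-3) and
# two movie groups (columns 0-1, columns 2-3).
R = np.array([
    [5.0, 5.0, 1.0, 1.0],
    [5.0, 0.0, 1.0, 1.0],   # the 0.0 is a placeholder for the unknown rating
    [1.0, 1.0, 5.0, 5.0],
    [1.0, 1.0, 5.0, 5.0],
])
mask = np.ones_like(R, dtype=bool)
mask[1, 1] = False          # rating of user 1 for movie 1 is unobserved

U, V = factorize(R, mask)
pred = U @ V.T
print("predicted rating for user 1, movie 1:", round(float(pred[1, 1]), 2))
```

Because user 1 rates the other movies like user 0 does, the fitted factors place the two users close together, and the predicted rating follows the pattern of their group. Clustering the rows of `U` (or of `V`) would be one way to recover user groups (or movie groups) like those in the figure.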