
Basic Concepts in Big Data

- This document defines big data and explains why it is important. Big data refers to large volumes of data that require new techniques to analyze and extract value from. Examples are given of big data in government, private sector, and science.
- The lifecycle of data involves acquiring, aggregating, analyzing and applying the data through knowledge. Computational views involve data access, analysis, understanding, integration and storage.
- Related topics that will be covered include machine learning, data mining, visualization, clustering, predictive modeling, classification, recommendation systems and more. Examples are given of clustering users and movies, as well as predictive modeling to estimate user ratings.

Uploaded by

dineshgomber
Copyright
© All Rights Reserved

Basic Concepts in Big Data

ChengXiang (Cheng) Zhai


Department of Computer Science
University of Illinois at Urbana-Champaign
http://www.cs.uiuc.edu/homes/czhai
What is big data?

- "Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization" (Gartner 2012)
- Complicated (intelligent) analysis of data may make small data appear to be big
- Bottom line: Any data that exceeds our
Why is big data a big deal?
Government
- Obama administration announced a big data initiative
- Many different big data programs launched
Private Sector
- Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data
- Facebook handles 40 billion photos from its user base
- Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts worldwide
Science
- Large Synoptic Survey Telescope will generate 140 terabytes of data every 5 days
- Biomedical computation such as decoding the human genome and personalized medicine
- Social science revolution
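To make the scale of two of these figures easier to compare, the short sketch below converts them into average rates. It is only back-of-the-envelope arithmetic on the numbers quoted above ("more than 1 million" transactions per hour, 140 terabytes every 5 days); the helper names `per_second` and `per_day` are made up for this illustration.

```python
# Back-of-the-envelope rates from the figures quoted on the slide.
# Both figures are rough (one is a lower bound), so the results are rough too.

SECONDS_PER_HOUR = 3600


def per_second(events_per_hour: float) -> float:
    """Convert an hourly event count into an average per-second rate."""
    return events_per_hour / SECONDS_PER_HOUR


def per_day(volume: float, days: float) -> float:
    """Convert a volume produced over `days` days into an average daily volume."""
    return volume / days


# "More than 1 million customer transactions every hour"
walmart_tps = per_second(1_000_000)

# "140 terabytes of data every 5 days"
lsst_tb_per_day = per_day(140, 5)

print(f"Walmart: at least {walmart_tps:.0f} transactions per second on average")
print(f"Telescope: about {lsst_tb_per_day:.0f} TB per day on average")
```

These are averages only; peak rates would be higher, and the slide gives no information about them.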
Lifecycle of Data: 4 As
[Figure: a cycle of four stages (Acquisition, Aggregation, Analysis, Application) connected by arrows labeled Scattered data, Integrated data, Knowledge, and Logged data]
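The four stages named on this slide (acquisition, aggregation, analysis, application) can be read as a simple pipeline. The sketch below is one hypothetical way to express that reading in code, not something taken from the slides: the record layout, the function names, and the threshold of 3.0 are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean


def acquire():
    """Acquisition: collect scattered raw records (hypothetical logged ratings)."""
    return [
        {"source": "web", "user": "user1", "rating": 4.0},
        {"source": "mobile", "user": "user1", "rating": 5.0},
        {"source": "web", "user": "user2", "rating": 1.0},
    ]


def aggregate(records):
    """Aggregation: integrate records from all sources, keyed by user."""
    by_user = defaultdict(list)
    for record in records:
        by_user[record["user"]].append(record["rating"])
    return dict(by_user)


def analyze(integrated):
    """Analysis: turn integrated data into knowledge (here, a mean rating per user)."""
    return {user: mean(values) for user, values in integrated.items()}


def apply_knowledge(knowledge, threshold=3.0):
    """Application: act on the knowledge (here, select users at or above a threshold)."""
    return sorted(user for user, avg in knowledge.items() if avg >= threshold)


knowledge = analyze(aggregate(acquire()))
targets = apply_knowledge(knowledge)
print(knowledge)
print(targets)
```

In a real system, whatever the application stage does would itself produce new logged data, which is what closes the cycle.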
Computational View of Big Data

Data Visualization
Data Access Data Analysis

Data Understanding Data Integration

Formatting, Cleaning

Storage Data
Big Data & Related Topics/Courses
CS199
[Figure: the computational view of big data, annotated with related topics/courses]
Layers: Data Visualization; Data Access, Data Analysis; Data Understanding, Data Integration; Formatting, Cleaning; Storage; Data
Related topics/courses: Human-Computer Interaction, Machine Learning, Databases, Information Retrieval, Data Mining, Computer Vision, Speech Recognition, Natural Language Processing, Data Warehousing, Signal Processing, Information Theory
Many Applications!
Some Data Analysis Techniques

Visualization
Classification
Predictive Modeling
Time Series
Clustering
Example of Analysis:
Clustering & Latent Factor Analysis

                     Group M1            Group M2
                     Movie 1   Movie 2   ...   Movie m
Group U1   User 1    3.5       4               5
           User 2    5         1
           ...
Group U2   User n    2         1               4
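The slide groups users (U1, U2) and movies (M1, M2) but does not name an algorithm. As one possible illustration, the sketch below uses plain k-means with a deterministic farthest-first initialization; this is a swapped-in, commonly used clustering technique, not necessarily the method the lecture has in mind, and it does not cover latent factor analysis. The rating matrix is a made-up, fully observed toy example rather than the table above, because basic k-means cannot handle missing cells.

```python
from math import dist


def farthest_first(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from all centroids chosen so far (deterministic, no randomness)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Return one cluster label per point using basic k-means."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy ratings: rows are users, columns are movies.
ratings = [
    [5, 4, 1],  # user A
    [4, 5, 1],  # user B
    [1, 2, 5],  # user C
    [2, 1, 4],  # user D
]

user_groups = kmeans(ratings, k=2)

# Clustering movies is the same procedure applied to the transposed matrix.
movie_columns = [list(col) for col in zip(*ratings)]
movie_groups = kmeans(movie_columns, k=2)

print("user groups:", user_groups)
print("movie groups:", movie_groups)
```

Users with similar rating rows end up with the same label, and movies with similar rating columns do likewise, which mirrors the row groups and column groups drawn on the slide.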
Example of Analysis: Predictive Modeling
                     Group M1            Group M2
                     Movie 1   Movie 2   ...   Movie m
Group U1   User 1    3.5       4               5
           User 2    5         1               = ?
           ...
Group U2   User n    2         1               4

Does user2 like movie m? (Binary) classification

What rating is user2 likely to give movie m? Regression
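The two questions on this slide can be illustrated with a small neighbor-based predictor: estimate the missing rating as a similarity-weighted average of other users' ratings for movie m (regression), then threshold that estimate (binary classification). This is only a sketch of one simple approach, not the method the lecture prescribes. It assumes that user2's two known ratings (5 and 1) belong to Movie 1 and Movie 2; the weighting scheme `1 / (1 + distance)` and the like-threshold of 3.5 are arbitrary choices made for the example.

```python
from math import dist

# Ratings from the slide's table; None marks the unknown cell ("= ?").
# Assumption: user2's known ratings belong to Movie 1 and Movie 2.
ratings = {
    "user1": [3.5, 4.0, 5.0],
    "user2": [5.0, 1.0, None],
    "user_n": [2.0, 1.0, 4.0],
}


def predict_rating(ratings, user, item):
    """Regression: similarity-weighted mean of other users' ratings for `item`."""
    target = ratings[user]
    weighted_sum = 0.0
    total_weight = 0.0
    for other, row in ratings.items():
        if other == user or row[item] is None:
            continue
        # Compare the two users only on movies both of them have rated.
        shared = [
            i for i, (a, b) in enumerate(zip(target, row))
            if a is not None and b is not None
        ]
        if not shared:
            continue
        distance = dist([target[i] for i in shared], [row[i] for i in shared])
        weight = 1.0 / (1.0 + distance)  # closer users count more
        weighted_sum += weight * row[item]
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None


def predict_like(ratings, user, item, threshold=3.5):
    """Binary classification: does the predicted rating reach the threshold?"""
    rating = predict_rating(ratings, user, item)
    return None if rating is None else rating >= threshold


MOVIE_M = 2  # column index of movie m in this toy table

predicted = predict_rating(ratings, "user2", MOVIE_M)
likes = predict_like(ratings, "user2", MOVIE_M)
print(f"predicted rating: {predicted:.2f}, likes movie m: {likes}")
```

Because the estimate is a weighted average of the two observed ratings for movie m (5 and 4), it always falls between them; with so few users, the result says more about the mechanics than about user2's actual taste.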
Some topics we'll cover
