Basic Concepts in Big Data

This document provides an introduction to basic concepts in big data. It defines big data as high-volume, high-velocity, and high-variety information assets that require new forms of processing. The document explains that big data is important for both government and private sector applications, as well as for science. It outlines the data lifecycle of acquisition, aggregation, analysis, and application. Computational and analytical views of big data are presented, along with some common data analysis techniques like visualization, classification, clustering, and predictive modeling. Examples of clustering and latent factor analysis as well as predictive modeling are given. Finally, related topics that will be covered are listed.

Uploaded by

yugandhar_ch

Basic Concepts in Big Data


ChengXiang (Cheng) Zhai
Department of Computer Science
University of Illinois at Urbana-Champaign
http://www.cs.uiuc.edu/homes/czhai
[email protected]

What is big data?


"Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization" (Gartner 2012)
Complicated (intelligent) analysis of data may make small data appear to be big
Bottom line: Any data that exceeds our current capability of processing can be regarded as big

Why is big data a big deal?


Government

Obama administration announced big data initiative
Many different big data programs launched

Private Sector
Walmart handles more than 1 million customer transactions
every hour, which are imported into databases estimated to
contain more than 2.5 petabytes of data
Facebook handles 40 billion photos from its user base.
Falcon Credit Card Fraud Detection System protects 2.1 billion
active accounts world-wide

Science
Large Synoptic Survey Telescope will generate 140 terabytes
of data every 5 days.
Biomedical computation, such as decoding the human genome and
personalized medicine
Social science revolution

Lifecycle of Data: 4 A's

Acquisition
Aggregation
Analysis
Application

Computational View of Big Data

Data Visualization
Data Access
Data Understanding
Data Analysis
Data Integration
Formatting, Cleaning
Storage
Data

Big Data & Related Topics/Courses

Human-Computer Interaction

CS199

Data Visualization
Databases

Information Retrieval

Data Access

Computer Vision, Speech Recognition

Data Understanding

Natural Language Processing

Machine Learning

Data Analysis
Data Mining

Data Integration
Data Warehousing

Formatting, Cleaning
Signal Processing

Storage

Information Theory

Many Applications!

Data

Some Data Analysis Techniques

Visualization
Classification
Time Series
Predictive Modeling
Clustering

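Example of Analysis: Clustering & Latent Factor Analysis

[Figure: user-by-movie rating matrix. Rows: User1, User2, ..., User n. Columns: Movie 1, Movie 2, ..., Movie m. Users are grouped into Group U1 and Group U2; movies are grouped into Group M1 and Group M2. Sample ratings shown: 3.5 and 5.]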
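The grouping of users in the matrix above can be sketched with a small k-means routine. This is only an illustration under stated assumptions: the slides do not name a clustering algorithm, the ratings below are made up, every cell is assumed to be filled, and the function names are hypothetical. It covers the clustering half only; latent factor analysis is not shown here.

```python
# Toy user-by-movie rating matrix (made-up values, not from the slides).
# Rows are users, columns are movies; every cell is filled for simplicity.
RATINGS = {
    "User1": [5.0, 4.5, 1.0, 1.5],
    "User2": [4.5, 5.0, 1.5, 1.0],
    "User3": [1.0, 1.5, 5.0, 4.5],
    "User4": [1.5, 1.0, 4.5, 5.0],
}


def squared_distance(a, b):
    """Squared Euclidean distance between two rating vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise average of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(points, initial_centers, iterations=10):
    """Plain k-means: assign each point to its nearest center, then re-average."""
    centers = [list(c) for c in initial_centers]
    assignment = {}
    for _ in range(iterations):
        assignment = {
            name: min(range(len(centers)),
                      key=lambda k: squared_distance(vec, centers[k]))
            for name, vec in points.items()
        }
        for k in range(len(centers)):
            members = [points[n] for n, c in assignment.items() if c == k]
            if members:  # keep the old center if a cluster ends up empty
                centers[k] = mean_vector(members)
    return assignment, centers


# Seed the two centers with two users (an arbitrary choice for this sketch).
user_groups, centers = kmeans(RATINGS, [RATINGS["User1"], RATINGS["User3"]])
print(user_groups)
```

With this toy data, User1 and User2 land in one group and User3 and User4 in the other, mirroring the Group U1 / Group U2 split in the figure. Movies could be grouped the same way by clustering the columns instead of the rows.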

Example of Analysis: Predictive Modeling

[Figure: the same user-by-movie rating matrix (User1 ... User n by Movie 1 ... Movie m, with Groups U1, U2, M1, M2 and sample ratings 3.5 and 5), with one unknown rating marked "=?".]

Does User2 like Movie m?
(Binary) Classification
What rating is User2 likely to give Movie m?
Regression
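The two questions above can be illustrated with a minimal sketch. The slides do not specify a prediction method, so the similarity-weighted average used here is an assumption, as are the ratings, the function names, and the 3.5 "like" threshold. The regression answer is the predicted rating; the binary answer is obtained here simply by thresholding that prediction, which is one of several possible approaches.

```python
# Made-up ratings (hypothetical data). User2 has not rated "Movie m".
RATINGS = {
    "User1": {"Movie 1": 3.5, "Movie 2": 4.0, "Movie m": 5.0},
    "User2": {"Movie 1": 3.0, "Movie 2": 4.5},
    "User3": {"Movie 1": 1.0, "Movie 2": 5.0, "Movie m": 2.0},
}

LIKE_THRESHOLD = 3.5  # assumed cutoff for "likes the movie"


def similarity(a, b):
    """Similarity in (0, 1] from the mean absolute rating gap on shared movies."""
    shared = [movie for movie in a if movie in b]
    if not shared:
        return 0.0
    gap = sum(abs(a[m] - b[m]) for m in shared) / len(shared)
    return 1.0 / (1.0 + gap)


def predict_rating(ratings, user, movie):
    """Regression: similarity-weighted average of other users' ratings."""
    weighted_sum = 0.0
    total_weight = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        weight = similarity(ratings[user], other_ratings)
        weighted_sum += weight * other_ratings[movie]
        total_weight += weight
    if total_weight == 0:
        return None  # nobody comparable has rated this movie
    return weighted_sum / total_weight


predicted = predict_rating(RATINGS, "User2", "Movie m")
# Binary classification by thresholding the regression output.
likes = predicted is not None and predicted >= LIKE_THRESHOLD
print(predicted, likes)
```

On this toy data User2 is closer to User1 than to User3, so the prediction (about 3.8) leans toward User1's rating of 5 and falls above the assumed threshold.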

Some topics we'll cover
