1. Data Science
Tushar B. Kute,
http://tusharkute.com
Objectives
• Data Volume
• 44x increase from 2009 to 2020
• From 0.8 zettabytes (ZB) to 35 ZB
Exponential increase in
collected/generated data
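As a rough check on the figures above, the growth factor and the implied average yearly growth can be worked out as follows. This is only a sketch: the steady compound growth over the whole period is an assumption made for illustration and is not stated in the slides.

```python
# Figures from the slide: 0.8 ZB in 2009, 35 ZB in 2020.
start_zb, end_zb = 0.8, 35.0
start_year, end_year = 2009, 2020

# Overall growth factor (the slide rounds this to 44x).
growth_factor = end_zb / start_zb

# Assumption: steady compound growth across the whole period.
years = end_year - start_year
annual_growth = growth_factor ** (1 / years) - 1

print(f"Growth factor: {growth_factor:.2f}x")
print(f"Implied yearly growth: {annual_growth:.1%}")
```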
Computer Memory Units
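To make the unit names concrete, here is a minimal sketch of converting between bytes and larger units. It assumes decimal (SI) units, where each step is a factor of 1000. Binary units (KiB, MiB, ...) use a factor of 1024 instead. The helper names `to_bytes` and `format_bytes` are hypothetical and chosen only for this illustration.

```python
# Decimal (SI) units: each step up is a factor of 1000.
# Assumption: decimal units are intended; binary units would use 1024.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def to_bytes(value, unit):
    """Convert a value expressed in the given unit to bytes."""
    return value * 1000 ** UNITS.index(unit)

def format_bytes(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000

print(format_bytes(to_bytes(35, "ZB")))
print(format_bytes(1500))
```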
Characteristics of Big Data: Variety
• Examples
• E-Promotions: Based on your current location, your purchase
history, and what you like, send promotions right now for the store
next to you.
Old Model: A few companies generate data; all others
consume data
• Relational Data
(Tables/Transaction/Legacy Data)
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
• Social Network, Semantic Web (RDF), …
• Streaming Data
What to do with this data?
• Internet Search
• Digital Advertisements (Targeted Advertising and re-targeting)
• Recommender Systems
• Image Recognition
• Speech Recognition
• Gaming
• Price Comparison Websites
• Airline Route Planning
• Fraud and Risk Detection
• Delivery logistics
Internet Search
Targeting Advertisement
Recommender System
Image Recognition
Speech Recognition
Computer Games
Price Comparison Website
Airline Route Planning
Fraud Detection
Delivery Logistics
Facets of Data
• Audio, image, and video are data types that pose specific
challenges to a data scientist.
• Tasks that are trivial for humans, such as recognizing
objects in pictures, turn out to be challenging for
computers.
• MLBAM (Major League Baseball Advanced Media) announced
in 2014 that it would increase video capture to
approximately 7 TB per game for live, in-game analytics.
• High-speed cameras at stadiums will capture ball and
athlete movements to calculate in real time, for example,
the path taken by a defender relative to two baselines.
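To get a feel for what 7 TB per game means for real-time processing, the average capture rate can be estimated as below. This is a back-of-the-envelope sketch. The 3-hour game length is an assumption and does not come from the slides, and decimal terabytes are assumed.

```python
# Figure from the slide: about 7 TB of video captured per game.
TB = 10 ** 12  # decimal terabyte, in bytes (assumption)
data_per_game_bytes = 7 * TB

# Assumption (not from the slides): a game lasts about 3 hours.
game_duration_s = 3 * 60 * 60

# Average rate at which data would have to be ingested.
bytes_per_second = data_per_game_bytes / game_duration_s
print(f"Average capture rate: {bytes_per_second / 10**6:.0f} MB/s")
```

A different game length changes the result proportionally, so the figure is only an order-of-magnitude guide.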
Audio, Video and Image
Web Resources
https://mitu.co.in
http://tusharkute.com