
IICT - Data Science

The document discusses big data and data science. It defines big data as large amounts of data measured in terabytes or petabytes. Examples are given where big data is used, such as by sports teams to track strategies, hospitals analyzing patient data, and Netflix storing petabytes of video data. Data science is defined as using scientific methods and algorithms to extract knowledge and insights from complex data. Data science techniques like machine learning, data analysis, and statistics are used to gain business insights from data. Specific machine learning methods like supervised learning, unsupervised learning, and regression are also outlined.

Data Science

IICT Lecture 11

FAST NUCES, Lahore


What is Big Data?
Lots of data (terabytes or petabytes)

Enterprises/companies derive a related amount of information and business insights from it



Big Data Scenarios



CrickInfo: Sports

• Ticketing companies use big data to track ticket sales
• Sports teams are using data to track team strategies
Hospital Care

• Hospitals are analyzing medical data and patient records to predict which patients are likely to seek readmission within a few months of discharge.
• Medical diagnostics companies analyze existing data to develop non-invasive tests for predicting coronary artery disease, lung cancer, brain tumors, and so on.
Netflix

• Netflix, Inc. is an American media-services provider headquartered in Los Gatos, California
• Uses 1 petabyte to store the videos for streaming
(According to the Motion Picture Association of America, the number fluctuates, but on average around 600 movies are created in the US every year)
• 1 petabyte of average MP3-encoded songs (for mobile, roughly one megabyte per minute) would take about 2000 years to play
• Movie recommendations
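
The playback-time figure can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes) and a constant rate of one megabyte per minute; the function and constant names are illustrative only.

```python
# Back-of-the-envelope check of the playback-time estimate.
# Assumptions: decimal units (1 PB = 10**15 bytes, 1 MB = 10**6 bytes)
# and a constant rate of one megabyte of audio per minute.
PETABYTE_BYTES = 10**15
MEGABYTE_BYTES = 10**6
MINUTES_PER_YEAR = 60 * 24 * 365


def playback_years(total_bytes, bytes_per_minute=MEGABYTE_BYTES):
    """Years of continuous playback for the given amount of audio."""
    minutes = total_bytes / bytes_per_minute
    return minutes / MINUTES_PER_YEAR


years = playback_years(PETABYTE_BYTES)
print(f"About {years:.0f} years of continuous playback")
```

Under these assumptions the result is roughly 1,900 years, which is consistent with the rounded figure of 2000 years on the slide.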
The Large Hadron Collider
(CERN Franco-Swiss)

• The Large Hadron Collider (LHC) is the world's largest and most powerful particle collider and the largest machine in the world.
• The LHC consists of a 27-kilometre ring of superconducting magnets with a number of accelerating structures to boost the energy of the particles along the way.
• The experiments in the LHC produce about 15 petabytes of data per year, which are distributed over the Worldwide LHC Computing Grid



And many more
• Predict traffic given a time and location
• Facebook’s friend suggestions, automatic image annotation, etc.
• Real-time soccer analytics
•…



What can Data do?

What is Data Science?
• Data science is a multifaceted field used to gain insights
from complex data. [MITx]
• Data science is an interdisciplinary field that
uses scientific methods, processes, algorithms and systems
to extract knowledge and insights from data in various
forms, both structured and unstructured. [Wikipedia]
• Data science is the study of where information comes
from, what it represents and how it can be turned into a
valuable resource in the creation of business and IT
strategies



What can Data Science do for business?

Data Science: extracting useful information and knowledge from large volumes of data in order to improve business decision-making, or providing business insights to make data-driven decisions.

[Diagram labels: Data, Business]



Other applications of Data Science



What Does Data Science Involve?
• Math and statistics knowledge
• Machine learning
• Data analysis
• Moreover, it employs techniques and theories drawn from many fields within the context of mathematics, statistics, information science, and computer science.
Machine Learning
•  Machine learning focuses on the
development of computer programs that
can access data and use it to learn for
themselves.
• Machine Learning programs are also
designed to learn and improve over time
when exposed to new data



Supervised Learning
• Definition:
– A program is “trained” on a pre-defined, labeled dataset. Based on its training data, the program can make decisions when given new data.
• Business Application
– Classifying Twitter sentiments
– Recommender systems
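
The idea of training on labeled examples and then deciding on new data can be sketched with a toy sentiment classifier. This is only an illustration under stated assumptions: the training sentences are invented, and the word-count scoring is a deliberately simple stand-in, not the method a real Twitter sentiment system would use.

```python
from collections import Counter

# Hypothetical labeled training data (invented for illustration).
TRAINING_DATA = [
    ("i love this phone", "positive"),
    ("what a great match", "positive"),
    ("really happy with the service", "positive"),
    ("i hate waiting in traffic", "negative"),
    ("terrible service and bad food", "negative"),
    ("this is the worst movie", "negative"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def predict(model, text):
    """Pick the label whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in model.items()}
    return max(scores, key=scores.get)


model = train(TRAINING_DATA)
print(predict(model, "great phone and great service"))
print(predict(model, "worst traffic ever"))
```

The two sentences passed to `predict` do not appear in the training data; the program labels them using only what it counted during training.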
Unsupervised Learning
• Definition
– Where a program, given a dataset, can automatically
find patterns and relationships within the dataset.
– Clustering or grouping of like data.
• Business Application
– Customer segmentation
– Understanding users and their behaviors
– Classifying unknown and predefined images into
categories
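
Clustering can be sketched with a small k-means routine on one-dimensional data. The spend figures and starting centroids below are hypothetical, and k-means is chosen here only as a common example of a clustering algorithm; the slides do not name a specific one.

```python
def k_means_1d(values, centroids, iterations=10):
    """Cluster numbers around the given starting centroids (k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(v - centroids[i])
            )
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer (invented for illustration).
spend = [12, 15, 14, 90, 95, 100, 13, 92]
centroids, segments = k_means_1d(spend, centroids=[0, 50])
print(centroids)
print(segments)
```

No labels are supplied: the routine separates low spenders from high spenders purely from the values themselves, which is the customer-segmentation case in miniature.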
Regression
• Sub-category of Supervised Learning
• Regression is a type of algorithm that predicts continuous values

Example: plot prices relative to increase in area
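
A line of best fit for the price-versus-area example can be computed with ordinary least squares. The sketch below uses invented areas and prices with no particular units; only the fitting formula is standard.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical plot areas and prices (invented for illustration).
areas = [5, 10, 15, 20]
prices = [52, 98, 151, 199]

slope, intercept = fit_line(areas, prices)
predicted = slope * 12 + intercept  # price estimate for an unseen area
print(f"price = {slope:.2f} * area + {intercept:.2f}")
print(f"predicted price for area 12: {predicted:.1f}")
```

The prediction for an area of 12 is a continuous value that lies between the known data points, which is what distinguishes regression from classification.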
Activity 1
• Go to the following link
– http://www.shodor.org/interactivate/activities/Regression/
– Plot the following data (next slide)
– Click on display line of best fit

Activity 2
• Go to the following link
– https://playground.tensorflow.org
– Click on the first option under “DATA”
– Make sure the HIDDEN LAYERS value is 2 (by using the + and – buttons)
– Make sure the first layer has 4 neurons and the second layer has 2 neurons
– Click on the play button and wait for some time
– Repeat for the 3rd option under “DATA”
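
The network shape used in this activity (2 inputs, hidden layers of 4 and 2 neurons, 1 output) can be sketched as a plain forward pass. This is a minimal illustration with assumptions: the weights are random and untrained, the tanh activation is assumed, and the input point is arbitrary. The playground additionally trains the weights, which this sketch does not do.

```python
import math
import random


def make_layer(n_inputs, n_neurons, rng):
    """One dense layer: a weight list and a bias per neuron."""
    return [
        ([rng.uniform(-1, 1) for _ in range(n_inputs)], rng.uniform(-1, 1))
        for _ in range(n_neurons)
    ]


def forward(layers, inputs):
    """Feed inputs through every layer, applying tanh at each neuron."""
    activations = list(inputs)
    for layer in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in layer
        ]
    return activations


rng = random.Random(0)  # fixed seed so the run is repeatable
sizes = [2, 4, 2, 1]    # 2 inputs, hidden layers of 4 and 2, 1 output
network = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

output = forward(network, [0.5, -0.3])  # arbitrary example point
print(output)
```

Each hidden layer turns the previous layer's outputs into new values, and the single output neuron gives the network's score for the input point.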
End of the Lecture
