Data Sciences

The document provides an overview of Artificial Intelligence, specifically focusing on Data Sciences, which is an interdisciplinary field that extracts knowledge from structured and unstructured data using various scientific methods. It discusses the relationship between Data Science and Machine Learning, types of analytics, applications in various domains such as finance and genetics, and the importance of data collection and visualization. Additionally, it outlines the structure and types of data, as well as basic statistical concepts relevant to data analysis.

Uploaded by

Nahid Noufal

Artificial Intelligence

Data Sciences

AI can be classified into 3 domains

○ data sciences - working around numeric & alpha-numeric data
○ computer vision - working around image & visual data
○ natural language processing - working around textual & speech-based data

data sciences -
○ interdisciplinary field that involves extracting knowledge & insights from structured & unstructured data
using various scientific methods, processes, algorithms & tools
○ concept to unify statistics, data analysis, machine learning & their related methods to understand &
analyse actual phenomena with data
○ employs techniques & theories drawn from many fields within the context of Mathematics, Statistics,
Computer Science & Information Science

DS draws inspiration from several domains including

○ mathematics & science: DS relies on mathematical concepts &
statistical techniques to analyse & interpret data, identify patterns &
make predictions
○ computer science & programming: data scientists use programming
languages like Python / R to manipulate & process data efficiently. They
use algorithms & computational techniques to extract insights from
large datasets
○ data visualisation: involves creating graphical representations, charts,
graphs to communicate data. It presents complex & large datasets in a
visually appealing format, allowing analysts & stakeholders to gain
insights, identify patterns & make informed decisions based on
visualised data
○ domain expertise: DS applies methods to various domains like finance,
healthcare, marketing & social sciences

Machine Learning & Data Science

machine learning - branch of AI that focuses on developing algorithms & models that allow computers to learn
from data, make predictions & perform tasks without being explicitly programmed

ML relies on DS principles & techniques to analyse & interpret large datasets. Data scientists use ML algorithms to
uncover patterns, relationships & trends within the data. By training these algorithms on historical data, they enable
the computer to learn from examples & make predictions / decisions based on new, unseen data
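
A minimal sketch of "learning from historical data & predicting on unseen data", using a least-squares straight-line fit as a stand-in for an ML algorithm. The function names (fit_line, predict) and the spend / units figures are hypothetical, chosen only for illustration:

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalised)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical historical data: advertising spend vs. units sold
spend = [1.0, 2.0, 3.0, 4.0]
units = [3.0, 5.0, 7.0, 9.0]

model = fit_line(spend, units)   # "training" on past examples
forecast = predict(model, 5.0)   # prediction for an unseen spend value
print(model, forecast)
```

Nothing in the code spells out the rule "units = 2 × spend + 1"; the parameters are estimated from the examples, which is the sense in which the computer is not explicitly programmed.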

Types of Analytics
DS relies on various types of analytics to extract meaningful information from complex datasets
○ descriptive analytics: examining & summarizing historical data to understand what happened in the past
ex. a marketing team analyses customer purchase data to gain insights into customer preferences, identify
popular products & understand sales trends

○ diagnostic analytics: focuses on analysing data to understand why certain events / outcomes occurred
ex. a manufacturing company may investigate data on production line performance to identify the root
cause of quality issues / machine breakdowns, using statistical analysis & root cause analysis techniques

○ predictive analytics: utilises historical data & statistical models to make predictions about future events /
outcomes
ex. an insurance company can analyse customer data, including demographics & historical claim records,
to build a predictive model that estimates the likelihood of future claim filing

○ prescriptive analytics: provides recommendations on what actions to take to achieve desired outcomes
ex. a supply chain management company uses optimization algorithms to determine the most efficient
routes for delivering goods, considering factors like delivery time, cost & available resources
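
As a rough illustration of descriptive analytics, the sketch below summarises what has already happened in a small set of purchase records. The records & variable names are made up for the example:

```python
from collections import Counter

# Hypothetical purchase records: (customer, product, amount)
purchases = [
    ("alice", "laptop", 900.0),
    ("bob", "phone", 500.0),
    ("alice", "phone", 500.0),
    ("carol", "phone", 450.0),
    ("bob", "headphones", 80.0),
]

# Which product was bought most often?
product_counts = Counter(product for _, product, _ in purchases)
most_popular, times_sold = product_counts.most_common(1)[0]

# Simple sales summary
total_revenue = sum(amount for _, _, amount in purchases)
average_order = total_revenue / len(purchases)

print(most_popular, times_sold, total_revenue, average_order)
```

Diagnostic, predictive & prescriptive analytics build on summaries like these by asking why the numbers look this way, what they are likely to be next & what to do about it.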

Applications of Data Science

Data science centres on analysing data, & that analysis helps make a machine intelligent enough to perform
tasks by itself

Fraud & Risk Detection

The earliest applications of data science were in Finance. Companies, fed up with debts & losses, used the data
they collected during the initial paperwork while sanctioning loans. Banking companies learnt to divide & conquer
data via customer profiling, past expenditures & other essential variables to analyse the probabilities of risk &
default. It helped them push their banking products based on customer’s purchasing power

Genetics & Genomics

Data science enables an advanced level of treatment personalization through research in genetics & genomics to
understand the impact of DNA on our health & find individual biological connections between genetics, diseases &
drug response. DS techniques allow integration of different kinds of data with genomic data in disease research,
which provides a deeper understanding of genetic issues in reactions to particular drugs & diseases. Reliable
personal genome data helps achieve a deeper understanding of the human DNA. The advanced genetic risk
prediction will be a major step towards more individual care

Internet Search

Google, Yahoo, Bing, Ask, AOL make use of data science algorithms to deliver the best result for our searched query.
Google processes more than 20 petabytes of data every day with the help of DS

Targeted Advertising

The digital marketing spectrum relies on DS algorithms, from the display banners on various websites to
the digital billboards at airports, which is why digital ads have been able to get a much higher CTR
(Click-Through Rate) than traditional advertisements. They can be targeted based on a user’s past behaviour

Website Recommendations

Suggestions about similar products on Amazon help us find relevant items among the billions of products it
offers & add to the user experience. A lot of companies use such recommendation engines to promote their products in
accordance with the user’s interest & relevance of information. Internet giants like Amazon, Twitter, Google Play,
Netflix, LinkedIn, IMDB use recommendations made based on previous search results to improve the user experience

Airline Route Planning

The Airline Industry across the world is known to bear heavy losses. Except for a few airline service providers,
companies are struggling to maintain their occupancy ratio & operating profits. With the steep rise in air-fuel prices &
the need to offer heavy discounts to customers, the situation has got worse. Now, using DS, airline companies
can:
○ predict flight delay
○ decide which class of airplanes to buy
○ decide whether to land directly at the destination or take a halt in between
○ effectively drive customer loyalty programs

Data Collection
data collection - involves gathering relevant data from various sources to support analysis & decision-making

Steps involved in data collection

1. identify data requirements:
○ clearly defining data requirements based on the objectives of the analysis or problem at hand
○ determining the types of data needed (numerical, categorical or textual data) & any specific
variables or features of interest
2. identify data sources
○ databases: data stored in structured databases (ex. SQL databases) can be accessed using
appropriate query languages
○ Application Programming Interfaces: online platforms & services offer APIs that allow data retrieval
through specific endpoints
○ web scraping: data can be extracted from websites using web scraping techniques to scrape HTML
or get data from APIs
○ sensor data: sensors & IoT devices generate real-time data
○ surveys & questionnaires: collect specific information from individuals or groups
○ social media & online platforms: data from social media platforms, online forums or
user-generated content
○ publicly available datasets: institutions & organisations that provide publicly accessible datasets
for analysis

While accessing data from any of the data sources

○ only public data should be used
○ personal datasets should only be used with the consent of the owner
○ private data obtained by breaching privacy shouldn’t be used
○ data should only be taken from reliable sources
○ reliable sources of data ensure the authenticity of data
3. gather data: once the sources are identified, data scientists collect the required data by extracting, downloading or
accessing it from the selected sources
4. data preprocessing
○ processing gathered data to clean, transform & format it for further analysis
○ handles missing values, removes duplicates, standardises formats & performs necessary data
transformations
5. validate data quality: data quality is assessed by checking for accuracy, completeness, consistency &
relevance, which involves examining the data for any anomalies, errors or inconsistencies & taking
appropriate measures to address them
6. store & organise data: data is stored in a structured manner, in databases or data warehouses. Proper data
organisation, including creating tables, defining relationships & indexing allows for efficient data retrieval &
analysis
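
The preprocessing step (step 4) can be sketched in plain Python as below. This is only one possible approach: the record layout, the preprocess function & the choice to fill a missing age with the mean of the known ages are assumptions made for the example, not a prescribed method:

```python
def preprocess(rows):
    """Clean a list of record dicts: standardise names, fill missing ages, drop duplicates."""
    known_ages = [r["age"] for r in rows if r.get("age") is not None]
    # Assumption: a missing age is replaced by the mean of the known ages
    default_age = sum(known_ages) / len(known_ages) if known_ages else None

    cleaned = []
    seen = set()
    for r in rows:
        name = r["name"].strip().title()            # standardise format
        age = r.get("age")
        if age is None:                             # handle missing value
            age = default_age
        key = (name, age)
        if key in seen:                             # remove duplicate
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": age})
    return cleaned


# Hypothetical raw records with inconsistent formatting, a duplicate & a gap
raw = [
    {"name": " alice ", "age": 30},
    {"name": "Alice", "age": 30},
    {"name": "BOB", "age": None},
    {"name": "carol", "age": 42},
]

clean = preprocess(raw)
print(clean)
```

In practice the right way to fill a gap depends on the data & the analysis; dropping the record or using the median are equally common choices.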

Data scientists must adhere to privacy regulations & ethical guidelines to protect sensitive information & ensure
responsible data handling practices

Structure of data
○ structured data:
✽ organized & well-defined
✽ stored in tables with rows & columns
✽ includes information like customer details, sales records & financial data found in databases &
spreadsheets
○ semi-structured data:
✽ doesn’t have strict structure but still has some organization
✽ may have tags or labels, like XML or JSON files
✽ ex. log files / social media feeds
○ unstructured data:
✽ lacks a predefined structure
✽ can be in various formats like text, images, audio or video
✽ includes social media posts, customer reviews, emails & multimedia content
✽ analysis requires specialized techniques like NLP or computer vision
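
To make the difference concrete, the sketch below reads a small piece of structured data (CSV, fixed columns) & a small piece of semi-structured data (JSON, where fields may be missing or nested) using Python's standard csv & json modules. The sample contents are invented for the example:

```python
import csv
import io
import json

# Structured: every row has the same columns
csv_text = "customer,amount\nalice,120\nbob,80\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
total = sum(int(row["amount"]) for row in rows)

# Semi-structured: records carry labels, but fields can vary between records
json_text = '[{"user": "alice", "tags": ["sale", "new"]}, {"user": "bob"}]'
records = json.loads(json_text)
# "tags" is optional here, so a default is needed when it is absent
tag_counts = [len(rec.get("tags", [])) for rec in records]

print(total, tag_counts)
```

The structured rows can be summed directly by column name, while the semi-structured records need a fallback for the field that one record lacks.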

Types of data
In DS & ML, the type of data being used influences which algorithms & techniques would be used to analyze the
data & make predictions. Different data types require different approaches to get accurate results
1. numerical data: consists of quantitative values that are expressed as numbers
○ continuous data:
✽ represents measurements that can take any value within a specific range
✽ can be represented as real numbers & often involve calculations & statistical analysis
✽ ex. temperature, height, weight & time
○ discrete data:
✽ represents values that are separate & distinct
✽ consists of whole numbers or counts, such as the number of products sold, the number of
people in a group or the number of votes received
✽ often used for counting, enumeration & categorical analysis
2. categorical data: non-numeric, qualitative or categorical variables that can take on a limited number of
distinct categories or labels
○ nominal data:
✽ represents categories that have no specific order or hierarchy
✽ ex. gender, eye colour or product categories
✽ typically used for grouping, classification & creating factors in statistical analysis
○ ordinal data:
✽ represents categories with a specific order or ranking
✽ ex. rating scales, customer satisfaction levels or high school levels
✽ retains the categorical nature but also conveys relative differences in magnitude or
preference
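
One practical consequence of the nominal / ordinal distinction is how categories are turned into numbers. The sketch below uses one-hot encoding for a nominal variable (no order implied) & rank encoding for an ordinal one (order preserved). The category lists & the helper one_hot are hypothetical:

```python
def one_hot(value, categories):
    """Encode a nominal value as a 0/1 vector, one position per category."""
    return [1 if value == c else 0 for c in categories]


# Nominal: eye colour has no inherent order
eye_colours = ["blue", "brown", "green"]
encoded_nominal = one_hot("brown", eye_colours)

# Ordinal: satisfaction levels have a meaningful ranking
satisfaction_order = ["low", "medium", "high"]
rank = {level: i for i, level in enumerate(satisfaction_order)}
responses = ["high", "low", "medium"]
encoded_ordinal = [rank[r] for r in responses]
sorted_responses = sorted(responses, key=rank.get)

print(encoded_nominal, encoded_ordinal, sorted_responses)
```

Assigning ranks to a nominal variable would suggest an order that does not exist, which is why the two kinds of data are usually encoded differently.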

Data Visualization
data visualisation -
○ represents data or information in a graph, chart or other visual formats
○ communicates information clearly & efficiently to users
○ provides a way to see & understand trends, outliers & patterns in data

common types of data visualizations

○ charts
○ graphs
○ tables
○ maps
○ histograms
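
As a small illustration of the last item, a histogram groups values into bins & shows how many fall in each. The sketch below builds one as text using only the standard library; the height values & the 10 cm bin width are assumptions for the example, & a plotting library would normally be used instead:

```python
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin; each bin is keyed by its lower edge."""
    return Counter((v // bin_width) * bin_width for v in values)


# Hypothetical heights in centimetres
heights_cm = [151, 158, 162, 165, 167, 171, 174, 183]
bins = histogram(heights_cm, 10)

# One row per bin, one '#' per value in that bin
for lower in sorted(bins):
    print(f"{lower}-{lower + 9}: {'#' * bins[lower]}")
```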

Basic statistics with Python

mean: the average value of a sequence
○ add all numbers & divide the sum by the number of values

median: 50th percentile value of a sequence

○ if the number of data points is odd, the median is the middle data point in the list
○ if the number of data points is even, the median is the average of the 2 middle data points in the list

mode: most frequent value of the sequence

○ count how often each number appears & the number that appears the most times is the mode

standard deviation: measures the spread of the sequence around its average value; it is the square root of the variance
variance: average of the squared differences from the mean
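
These measures can be computed with Python's built-in statistics module, as sketched below on a made-up list of scores. The population versions (pvariance, pstdev) are used because they match the definition above, which divides by the number of values; statistics.variance & statistics.stdev are the sample versions, which divide by one less:

```python
import statistics

# Hypothetical data set
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)            # sum of values / number of values
median = statistics.median(scores)        # even count: average of the 2 middle values
mode = statistics.mode(scores)            # most frequent value
variance = statistics.pvariance(scores)   # average squared difference from the mean
std_dev = statistics.pstdev(scores)       # square root of the variance

print(mean, median, mode, variance, std_dev)
```

Here the sum is 40 over 8 values, so the mean is 5; the 2 middle values are 4 & 5, so the median is 4.5; 4 appears most often; the squared differences sum to 32, giving a variance of 4 & a standard deviation of 2.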
