PDF Data Science

Uploaded by

dwivediudit0105

DATA SCIENCE

Introduction to Data- Data refers to raw facts and figures that
are collected for analysis. It can come in various forms such as
numbers, text, images, etc. Artificial Intelligence has three broad
domains: Data Science, Computer Vision, and Natural Language
Processing.
Data Science- Data Science is an interdisciplinary field that
combines various techniques, algorithms, and processes to
extract meaningful insights from structured and unstructured
data. It uses tools from mathematics, statistics, computer
science, and domain knowledge to analyze and interpret data.
Computer Vision- It is a field of artificial intelligence (AI) and
computer science that enables computers and systems to
interpret and understand visual information from the world, such
as images, videos, or other visual inputs.
Natural Language Processing- Natural Language Processing (NLP)
is a branch of artificial intelligence (AI) and computational
linguistics that focuses on the interaction between computers
and human (natural) languages.
The goal of NLP is to enable computers to understand, interpret,
and generate human language in a way that is both meaningful
and useful. This involves making sense of spoken or written
language, allowing machines to process and respond to human
communication.
Types of Data:
Structured Data: Organized data in rows and columns (e.g.,
spreadsheets or databases).
Unstructured Data: Data that does not have a pre-defined format
(e.g., social media posts, images, audio files).
Qualitative Data: This type of data is descriptive and categorical.
It can include attributes, labels, or categories that describe
characteristics, such as color, name, type, or preference.
Quantitative Data: Numerical data that can be measured or
counted (e.g., height, weight, age).
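As a small illustration of the qualitative/quantitative distinction, the sketch below stores a few records as structured data (rows and columns) and labels each column by the kind of values it holds. The records and the `column_kind` helper are made up for this example, and the rule it uses is only a rough heuristic: numeric codes such as roll numbers are stored as numbers but are really qualitative labels.

```python
# Structured data: every record has the same named fields (columns).
# These records are invented for illustration.
students = [
    {"name": "Asha", "favourite_colour": "blue", "height_cm": 152, "age": 14},
    {"name": "Ravi", "favourite_colour": "green", "height_cm": 160, "age": 15},
    {"name": "Meena", "favourite_colour": "blue", "height_cm": 149, "age": 14},
]


def column_kind(rows, column):
    """Label a column as quantitative (all numbers) or qualitative (otherwise)."""
    values = [row[column] for row in rows]
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "quantitative"
    return "qualitative"


kinds = {column: column_kind(students, column) for column in students[0]}
for column, kind in kinds.items():
    print(f"{column}: {kind}")
```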
2. Data Collection and Sources:
Data can be collected from different sources, such as surveys,
sensors, websites, databases, and social media platforms.
Students learn how to gather and organize data in a systematic
manner.
3. Data Representation:
Forms of data: text, numbers, images, video, and audio.
Tabular Representation: Organizing data in tables or charts for
easier analysis.
Graphical Representation: Visual tools like bar graphs, pie charts,
histograms, and line graphs help in representing data in a way
that is easy to understand.
Charts and Graphs:
Bar Graphs: Used for comparing quantities.
Pie Charts: Used for showing proportions.
Histograms: Used for showing the frequency distribution of data.
Line Graphs: Used to show trends over time.
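To make the idea of a frequency distribution concrete, here is a minimal sketch that groups values into bins and prints a text version of a histogram. The scores and the bin width of 10 are assumptions chosen for illustration; a plotting library would normally draw the bars.

```python
from collections import Counter

# Hypothetical test scores out of 50 for a class of 12 students.
scores = [12, 25, 31, 38, 41, 18, 27, 33, 45, 29, 36, 22]
BIN_WIDTH = 10


def frequency_table(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    counts = Counter((value // bin_width) * bin_width for value in values)
    return dict(sorted(counts.items()))


table = frequency_table(scores, BIN_WIDTH)
for start, count in table.items():
    # One '#' per student gives a simple text version of a histogram bar.
    print(f"{start:2d}-{start + BIN_WIDTH - 1:2d} | {'#' * count}")
```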
4. Data Analysis:
Descriptive Statistics: Analyzing data using mean, median, mode,
range, and standard deviation to summarize the data.
Data Cleaning: The process of removing or correcting errors in
the data to improve its quality.
Data Processing: Organizing and transforming raw data into a
form suitable for analysis.
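The three steps above can be sketched with Python's standard `statistics` module. The raw survey answers and the cleaning rule (keep only positive whole numbers) are assumptions made for this example, not a general-purpose cleaning method.

```python
import statistics

# Hypothetical raw survey answers for "age"; some entries are unusable.
raw_ages = ["14", "15", " 14 ", "", "abc", "16", "14", "-3", "15"]


def clean_ages(raw_values):
    """Keep only entries that parse as a plausible age (a positive integer)."""
    cleaned = []
    for value in raw_values:
        text = value.strip()  # remove stray spaces
        if text.isdigit() and int(text) > 0:
            cleaned.append(int(text))  # convert text to a number
    return cleaned


ages = clean_ages(raw_ages)  # data cleaning and processing

# Descriptive statistics that summarize the cleaned data.
summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "range": max(ages) - min(ages),
    "std_dev": statistics.pstdev(ages),  # population standard deviation
}
print(ages)
print(summary)
```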
5. Basic Probability and Statistics:
Understanding how to calculate probabilities and apply basic
statistical techniques to interpret data.
Using measures of central tendency (mean, median, mode) and
dispersion (range, variance, standard deviation).
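A short sketch of both ideas, assuming a fair six-sided die for the probability part and a made-up list of daily temperatures for the dispersion part:

```python
from fractions import Fraction
import statistics

# Probability = favourable outcomes / total outcomes,
# valid when every outcome is equally likely (a fair die is assumed).
die_faces = [1, 2, 3, 4, 5, 6]
even_faces = [face for face in die_faces if face % 2 == 0]
p_even = Fraction(len(even_faces), len(die_faces))

# Dispersion of a small, invented data set of daily temperatures (in °C).
temperatures = [30, 32, 31, 35, 32]
data_range = max(temperatures) - min(temperatures)
variance = statistics.pvariance(temperatures)  # population variance
std_dev = statistics.pstdev(temperatures)      # square root of the variance

print(p_even, data_range, variance, std_dev)
```

The population versions (`pvariance`, `pstdev`) divide by the number of values; `statistics.variance` and `statistics.stdev` are the sample versions and divide by one less.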
6. Data Interpretation (Explanation) and Decision Making:
Interpreting the results from the analysis and making data-driven
decisions.
Understanding patterns and trends in data that can lead to
informed conclusions or actions.
7. Applications of Data Science:
In Everyday Life: Examples include recommendations on
streaming services (like Netflix), shopping suggestions (Amazon),
and weather forecasts.
In Different Fields: Data science is used in various industries
such as healthcare (predicting diseases), finance (fraud
detection), education (personalized learning), and marketing
(customer behavior analysis).
8. Introduction to Programming:
Basic programming concepts may be introduced, using tools such
as Python, R, or spreadsheets to manipulate and analyze data.
Teaching Methodology:
This can include:
Hands-on activities, such as collecting data and analyzing it in
spreadsheets.
Simple coding exercises for understanding data analysis using
tools like Python.
Group discussions on the use of data science in real-world
problems.
OTHER APPLICATIONS OF DATA SCIENCE
Data Science has many real-life application areas, most
importantly in Artificial Intelligence. Data Science deals with
collecting data, analyzing it, and then using it to train a machine
learning model that performs tasks in a specific field.
1. Data Analysis in Sports: - In sports, data science is used to
analyze player performance, predict match outcomes, and
improve team strategies.
2. E-commerce and Online Shopping: - Online stores use data
science to recommend products to customers based on their
browsing history and past purchases.
Ex: - Amazon or Flipkart
3. Healthcare and Medical Diagnosis: - Data science is used to
analyze medical data such as X-rays, blood tests, and patient
histories to diagnose diseases and recommend treatments.
Ex: - In detecting diseases like cancer, data science algorithms
analyze medical images to find signs of tumors.
4. Social Media: - Platforms like Facebook, Instagram, and
Twitter use data science to analyze user behavior and
personalize the content shown to them.
Ex: - Data science helps determine which posts, ads, and videos
appear in a user's feed based on their interactions and interests.
5. Weather Forecasting: - Data science helps meteorologists
predict weather by analyzing patterns in historical data, satellite
images, and sensor data.
Ex: - Predicting the weather for the upcoming days (temperature,
rainfall, etc.) based on large datasets of weather observations.
6. Traffic Management and Navigation: - Data science is used
in traffic management systems and navigation apps like Google
Maps to analyze real-time traffic data and optimize travel routes.
Ex: - Google Maps uses data science to show traffic patterns,
suggest faster routes, and predict arrival times.
7. Banking and Finance: - Data science is used by banks to
detect fraudulent transactions and assess loan eligibility based
on historical financial data.
Ex: - Banks analyze transaction data to flag any unusual or
fraudulent activities. They also use credit scoring models to
determine if someone qualifies for a loan.
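One very simple way to flag an unusual transaction is to check how far its amount lies from the customer's usual spending. The sketch below uses a z-score rule with a threshold of 3 standard deviations; the amounts, the threshold, and the `is_unusual` helper are all assumptions for illustration, and real fraud detection systems use far richer data and models.

```python
import statistics

# Hypothetical past transaction amounts (in rupees) for one customer.
history = [450, 520, 610, 480, 530, 495, 570, 505]


def is_unusual(amount, past_amounts, threshold=3.0):
    """Flag an amount more than `threshold` standard deviations from the past mean."""
    mean = statistics.mean(past_amounts)
    std_dev = statistics.stdev(past_amounts)  # sample standard deviation
    z_score = (amount - mean) / std_dev
    return abs(z_score) > threshold


print(is_unusual(540, history))    # close to the usual amounts
print(is_unusual(25000, history))  # far outside the usual range
```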
8. Entertainment and Movies: - Streaming services like Netflix
and YouTube use data science to recommend movies, TV shows,
and videos based on your watching history.
Ex: - When you watch a series on Netflix, it suggests other shows
or movies that you might like based on your previous choices.
9. Education and Personalized Learning: - Data science is used
in education to analyze students' performance and tailor learning
materials to their needs, helping them improve.
Ex: - Online learning platforms like Khan Academy or Coursera
track students' progress and provide personalized
recommendations based on their performance.
10. Agriculture and Farming: - Data science is applied in
agriculture to predict crop yields, detect diseases, and optimize
farming practices.
Ex: - Using data from weather patterns and soil conditions,
farmers can predict the best time to plant crops or detect early
signs of plant disease.
11. Government and Public Services: - Governments use data
science to analyze public data for improving city infrastructure,
traffic management, crime prevention, and public health.
Ex: - Smart cities use data science for better waste management,
water supply monitoring, and traffic regulation.

Revisiting the AI Project Cycle can help us better understand how
AI projects are developed, from identifying the problem to
deploying and maintaining the AI system. This cycle follows a
structured approach to ensure that AI solutions are built
effectively and meet the goals of the project.
1. Problem Definition- Understand the business or research
problem and define clear goals for the AI project. In problem
solving, we study a problem and try to find a solution to it. In
this stage we work on the 4Ws: Who, What, Where, and Why.
2. Data Collection- After finalizing the aim of our project, we need
to look at the various data features that affect the problem in
some way. An AI-based project requires data for training and
testing, so we need to understand what kind of data must be
collected to work towards the aim.
3. Data Exploration- After storing the data in a database, we can
arrange it in a meaningful manner to extract information from it.
4. Model Selection and Training- Modelling refers to the process
of using statistical or computational techniques to make
predictions, understand patterns, or draw conclusions based on
data. The goal of modelling is usually to apply basic mathematical
or statistical models to solve simple problems. These models can
either predict numerical values or categorical outcomes based on
the data.
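As a minimal sketch of such a basic model, the code below fits a straight line by least squares to predict a numerical value, then applies a cut-off to turn that prediction into a categorical outcome. The data, the pass mark of 40, and the function names are assumptions for illustration; applying a threshold to a regression line is only a crude stand-in for a proper classification model.

```python
# Hypothetical data: hours studied (x) and marks scored (y).
hours = [1, 2, 3, 4, 5]
marks = [35, 45, 50, 62, 68]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / spread_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, marks)


def predict_marks(hours_studied):
    """Predict a numerical value from the fitted line."""
    return slope * hours_studied + intercept


def predict_result(hours_studied, pass_mark=40):
    """Turn the numerical prediction into a categorical outcome."""
    return "pass" if predict_marks(hours_studied) >= pass_mark else "fail"


print(predict_marks(6), predict_result(1), predict_result(3))
```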
5. Model- In this stage, we prepare our dataset. Once this dataset
is ready, we train our model on it. Model development was done
at multiple levels to arrive at the most suitable model.
At the first level, we developed two models using Multiple Linear
Regression (MLR). The first used the actual available variables.
The second used one additional variable, i.e., the previous day's
level of the particular pollutant being predicted (the dependent
variable).
At the second level, we developed the model using a Neural
Network (NN). Once again, this was further divided into two parts.
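The additional variable used in the second MLR model can be illustrated as follows: each day's row is paired with the pollutant level from the day before. The notes do not list the actual variables, so temperature and wind speed, the readings, and the `add_previous_day_level` helper are all invented for this sketch.

```python
# Hypothetical daily readings:
# (temperature in °C, wind speed in km/h, pollutant level).
readings = [
    (30, 5, 110),
    (31, 4, 125),
    (29, 8, 98),
    (28, 10, 90),
    (32, 3, 140),
]


def add_previous_day_level(rows):
    """Return (features, target) pairs whose features include yesterday's level."""
    dataset = []
    # Pair each day with the day before it; the first day has no
    # previous day, so it cannot be used as a training row.
    for yesterday, today in zip(rows, rows[1:]):
        temperature, wind_speed, level = today
        previous_level = yesterday[2]
        dataset.append(((temperature, wind_speed, previous_level), level))
    return dataset


dataset = add_previous_day_level(readings)
for features, target in dataset:
    print(features, "->", target)
```

A regression model or a neural network could then be trained on these rows, with the pollutant level as the dependent variable.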
6. Evaluation- Evaluation refers to the process of assessing how
well a model performs after being trained on data.
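Two common ways to measure performance are mean absolute error for numerical predictions and accuracy for categorical predictions. The sketch below computes both on invented test data; which measure is appropriate depends on the kind of model, and the values here are assumptions for illustration only.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, for numerical predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def accuracy(actual, predicted):
    """Fraction of predictions that match the true label, for categorical predictions."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical test data that was not used during training.
actual_levels = [125, 98, 90, 140]
predicted_levels = [120, 104, 95, 130]
mae = mean_absolute_error(actual_levels, predicted_levels)

actual_labels = ["pass", "fail", "pass", "pass"]
predicted_labels = ["pass", "pass", "pass", "fail"]
acc = accuracy(actual_labels, predicted_labels)

print(mae, acc)
```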
