Copy of Introduction to DS.pdf

Uploaded by

Ashwin chaudhari
INTRODUCTION TO DATA SCIENCE AND MACHINE LEARNING
DEFINING DATA
DATA SCIENCE AND ITS
IMPORTANCE
“Information is the oil of the 21st century, and analytics is the combustion engine.” – Peter Sondergaard, SVP, Gartner Research
Data science became a buzzword when the Harvard Business Review called data scientist “The Sexiest Job of the 21st Century.” Because of this, the term tends to be used loosely to describe predictive modeling, business intelligence, business analytics, or other uses of data, or to make statistics sound more interesting.
“Hiding within those mounds of data is knowledge that could change the life of a patient or change the world.” – Atul Butte, Stanford
WHAT IS DATA SCIENCE
Data Science is a multidisciplinary field that uses scientific methods, algorithms, processes, and systems to extract knowledge and insights from structured and unstructured data.

It combines expertise from various domains such as statistics, computer science, domain knowledge, and data engineering to analyze and interpret complex data.
Mathematics: a field of study that deals with the properties, relationships, and operations of numbers, quantities, shapes, patterns, structures, and change.

Statistics: a way of making sense of a collection of observations. It aims particularly to help us avoid jumping to conclusions, and it reminds us to be cautious about the extent to which we can generalize.

Probability: likelihood, or weighing up the chances.
AI VS ML VS DL

Artificial Intelligence
- Human Intelligence Exhibited by Machines
Machine Learning
- An Approach to Achieve Artificial Intelligence
Deep Learning
- A Technique for Implementing Machine Learning
DATA ANALYSIS LIFE CYCLE
DATA SCIENCE LIFE CYCLE
BENEFITS OF DATA SCIENCE FOR
A BUSINESS
• It will monetize data
• It will mitigate company risk
• It will help a company get a better understanding of its customers
• It will give businesses unique insights
• It will help with business expansion
• It will improve forecasting
• It will provide businesses with objective decisions
WHAT YOU NEED FOR DATA
SCIENCE
When it comes to data science, there are
three major skill areas that are blended
together.

1. Mathematics expertise
2. Technology; hacking skills
3. Business/strategy acumen
REAL-WORLD APPLICATIONS OF
DATA SCIENCE
AREAS OF DATA SCIENCE
• Machine Learning
• Deep Learning
• Data Analysis
• Big Data
• Natural Language Processing (NLP)
• Computer Vision
• Business Intelligence (BI)
• Data Engineering
• Data Mining
• Financial Analytics
• Geospatial Data Science
• Environmental Data Science
MACHINE LEARNING
The basic idea of machine learning, or ML, is to learn to do a certain task from
data.
Herbert Alexander Simon:
"Learning is any process by which a system improves performance from experience."

"Machine Learning is concerned with computer programs that automatically improve their performance through experience."
THE CONCEPT OF LEARNING IN A
ML SYSTEM
The basic idea of machine learning, or ML, is to learn to do a
certain task from data.
Learning = Improving with experience at some task
- Improve over task T,
- with respect to performance measure P,
- based on experience E.
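
The T, P, E framing above can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the task T (label a number as 1 when it is at least 0.5), the performance measure P (accuracy on a fixed grid of test points), the experience E (a growing set of labelled examples), and the helper names make_examples, learn_threshold and accuracy.

```python
TRUE_THRESHOLD = 0.5      # hidden rule the learner has to discover (illustrative)
GOLDEN = 0.6180339887498949


def make_examples(n):
    """Experience E: n labelled points spread deterministically over [0, 1)."""
    xs = [(i * GOLDEN) % 1.0 for i in range(1, n + 1)]
    return [(x, int(x >= TRUE_THRESHOLD)) for x in xs]


def learn_threshold(examples):
    """Place the decision threshold halfway between the two classes seen so far."""
    negatives = [x for x, label in examples if label == 0]
    positives = [x for x, label in examples if label == 1]
    lower = max(negatives, default=0.0)   # largest point labelled 0
    upper = min(positives, default=1.0)   # smallest point labelled 1
    return (lower + upper) / 2


def accuracy(threshold):
    """Performance P: fraction of test points classified like the true rule."""
    test_xs = [k / 1000 for k in range(1001)]
    correct = sum(
        int(x >= threshold) == int(x >= TRUE_THRESHOLD) for x in test_xs
    )
    return correct / len(test_xs)


acc_small = accuracy(learn_threshold(make_examples(3)))
acc_large = accuracy(learn_threshold(make_examples(200)))
print(f"accuracy with 3 examples:   {acc_small:.3f}")
print(f"accuracy with 200 examples: {acc_large:.3f}")
```

With more labelled examples, the gap around the true threshold narrows, so the measured accuracy on the same task goes up. That is the sense in which the program "improves with experience".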
MACHINE LEARNING VS
TRADITIONAL PROGRAMMING
MACHINE LEARNING TYPES

Supervised Learning – “Teach me what to learn”
Unsupervised Learning – “I will find what to learn”
Reinforcement Learning – “I’ll learn from my mistakes at every step (trial and error)”
SUPERVISED MACHINE
LEARNING
REAL-LIFE EXAMPLES:

Email Spam (Classification) – The algorithm takes historical emails labeled as spam or non-spam as input. It learns patterns in that data and uses them to separate spam from legitimate mail.

Stock Price Prediction (Regression) – Historical market data is fed to the algorithm. A regression model fitted to that data is then used to predict a future price.
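
As a minimal sketch of the regression case, the code below fits a straight line to a handful of past prices with ordinary least squares and extrapolates one day ahead. The prices are invented for illustration, the helper name fit_line is hypothetical, and a straight line is far too simple for real markets. The point is only the supervised pattern: labelled history in, prediction out.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Historical data: inputs (day number) paired with known answers (price).
days = [1, 2, 3, 4, 5]
prices = [10.0, 12.0, 13.0, 15.0, 16.0]   # made-up values

slope, intercept = fit_line(days, prices)

# Prediction for an input the model has not seen.
predicted = slope * 6 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, day 6 -> {predicted:.2f}")
```

The same structure applies to the spam example, except that the known answers are class labels (spam or not spam) and the model outputs a class instead of a number.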
UNSUPERVISED MACHINE
LEARNING
REAL-LIFE EXAMPLES:
Customer Segmentation:
• Input: Customer behaviors like purchase history, browsing
patterns.
• Output: Groups or clusters of customers with similar traits.
• Real World: E-commerce platforms like Amazon create
personalized recommendations by segmenting users.

Document Clustering:
• Input: Text documents (e.g., news articles).
• Output: Grouped topics or clusters.
• Real World: Google News uses clustering to organize similar news
stories together.
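
To make the customer segmentation example concrete, here is a small k-means sketch in plain Python. It is an illustration under stated assumptions, not how any particular platform does it: the customer numbers are invented, each customer is described by only two features (orders per month, average basket value), the features are left unscaled, and the starting centroids are picked by hand so the run is repeatable.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members (keep it if it has none)."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def kmeans(points, initial_centroids, max_iterations=10):
    """Alternate assignment and update steps until the labels stop changing."""
    centroids = list(initial_centroids)
    labels = assign(points, centroids)
    for _ in range(max_iterations):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# (orders per month, average basket value) for six made-up customers.
customers = [(1, 20.0), (2, 25.0), (1, 22.0), (10, 200.0), (12, 220.0), (11, 210.0)]
labels, centroids = kmeans(customers, [customers[0], customers[-1]])
print(labels)
print(centroids)
```

No labels are supplied anywhere: the two groups (occasional small buyers and frequent large buyers) come out of the data alone, which is what distinguishes this from the supervised examples.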
SUPERVISED VS UNSUPERVISED
REINFORCEMENT MACHINE
LEARNING
REAL-LIFE EXAMPLES:
Self-Driving Cars:
• Input: Sensor data (camera, lidar, radar).
• Output: Actions (steering, braking, accelerating).
• Real World: Companies like Tesla and Waymo use reinforcement
learning to train autonomous vehicles.

Game AI:
• Input: Game state (board position, opponent’s moves).
• Output: Next best move.
• Real World: AlphaGo by DeepMind defeated human champions in
the game of Go using reinforcement learning.
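
The input, action, reward loop behind these examples can be shown on a toy problem. The sketch below uses tabular Q-learning on a five-cell corridor where the only reward is for reaching the rightmost cell. The environment, the constants, and the function names step, train and greedy_policy are all assumptions for illustration. Systems such as self-driving stacks or AlphaGo are vastly more complex and combine reinforcement learning with other techniques.

```python
import random

N_STATES = 5                 # corridor cells 0..4
GOAL = N_STATES - 1          # reaching cell 4 ends the episode
ACTIONS = (-1, +1)           # action index 0 = left, 1 = right
GAMMA = 0.9                  # discount factor
ALPHA = 0.5                  # learning rate


def step(state, action_index):
    """Environment: move one cell, stay inside the corridor, reward 1 at the goal."""
    next_state = min(GOAL, max(0, state + ACTIONS[action_index]))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=200, seed=0):
    """Q-learning with a purely random behaviour policy (trial and error)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = rng.randrange(2)                    # try something
            next_state, reward = step(state, action)     # observe the outcome
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action in every non-terminal cell."""
    return ["right" if right > left else "left" for left, right in q[:GOAL]]


q_table = train()
print(greedy_policy(q_table))
```

The agent is never told that "right" is correct. It only sees rewards, and the value estimates for moving right grow because those moves lead to the reward sooner.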
AI ML DL AND DATA SCIENCE
JOB PROFILES IN DATA SCIENCE
 Data  Research
 Scientist
Machine Learning  Scientist
Data
 Engineer
Data Analyst  Journalist
Geospatial
 Business Intelligence  Analyst
 (BI) Analyst Data Chief Data Officer
 Engineer (CDO)
 Big Data Engineer
 Quantitative Analyst
 Database
(Quant) Data Architect
 Administrator
AI Engineer (DBA)
 Data Consultant
 Data Product
Manager
HERE IS A BREAKDOWN OF WHERE
DATA SCIENTISTS WORK
• 2% of data scientists work in gaming
• 4% work in consumer goods and retail
• 4% work in academia
• 4% work in government
• 6% work in financial services
• 7% work in pharmaceuticals and healthcare
• 9% work in consulting
• 11% work in a corporate setting
• 13% work in marketing
• 41% work in technology
WOULD YOU BE A GOOD DATA
SCIENTIST
To figure out whether or not you would make a good
data scientist, ask yourself these questions:

• Are you interested in broadening your skills and taking on new challenges?
• Do you communicate well both visually and verbally?
• Do you enjoy problem-solving and individualized work?
• Are you interested in data analysis and collection?
• Do you have substantial work experience in the areas involved
in data science?
• Do you have a degree in marketing, management information
systems, computer science, statistics, or mathematics?
TOOLS FOR DATA SCIENCE

• Programming Languages: Python, R, Julia
• Integrated Development Environments (IDEs): Jupyter Notebooks, Spyder, VS Code, PyCharm
• Data Manipulation and Analysis: Pandas, NumPy, SQL
• Data Visualization: Matplotlib, Seaborn, Plotly
• Machine Learning: Scikit-learn, TensorFlow, Keras, PyTorch
• Big Data Processing: Apache Spark, Hadoop
• Version Control: Git
• Containerization: Docker
• Cloud Platforms: AWS, Azure, Google Cloud
• Notebook Sharing and Collaboration: Google Colab, Kaggle Notebooks
• Dashboarding: Tableau, Power BI
• Workflow Automation: Apache Airflow

Remember that the choice of tools often depends on the specific requirements of your project, the nature of the data, and your personal preferences. As the field of data science evolves, new tools and technologies continually emerge, so staying informed about the latest developments is also important.

SKILLS NEEDED FOR DATA
SCIENCE
• Critical Thinking
• Problem-Solving
• Curiosity
• Communication
• Teamwork and Collaboration
• Domain Knowledge
• Ethical Awareness
• Adaptability
• Time Management
• Attention to Detail
• Business Acumen
• Storytelling
• Resilience
• Listening Skills
• Negotiation Skills
• Conflict Resolution
• Project Management
• Patience
