Machine Learning Unit-1.1

Uploaded by

sahil.utube2003

Overview of Course

1. Introduction
2. Linear Regression and Decision Trees
3. Instance-based Learning and Feature Selection
4. Probability and Bayes Learning
5. Support Vector Machines
6. Neural Networks
7. Introduction to Computational Learning Theory
8. Clustering
UNIT-1: Overview of Course
• Introduction to Machine Learning. Human Learning and its types.

• Machine Learning and its types (Supervised, Unsupervised and Reinforcement Learning).

• Well-posed learning problems.

• Applications of Machine Learning. Issues in Machine Learning.

• Basic Types of Data in Machine Learning: Numerical and Categorical data, Data Quality and Remediation.
Figure 1: Machine
• Artificial Intelligence is the concept of creating intelligent machines, that is, machines able to imitate the intelligence of the human brain.

• Machine Learning is a subset of Artificial Intelligence that helps you build AI-driven applications. It is the application of AI that allows a system to learn and improve automatically from experience.

• Deep Learning is a subset of Machine Learning that uses complex algorithms and deep neural networks to train a model.
Difference Between Machine Learning And Artificial Intelligence

• Artificial Intelligence is the concept of creating intelligent machines that simulate human behaviour, whereas
• Machine Learning is a subset of Artificial Intelligence that allows machines to learn from data without being explicitly programmed.
• Ever since computers were invented, we have wondered whether they might be made to learn. If we could understand how to program them to learn, that is, to improve automatically with experience, the impact would be dramatic.

• Imagine computers learning from medical records which treatments are most effective for new diseases.

• A successful understanding of how to make computers learn would open up new uses of computers and new levels of competence and customization.
• We do not yet know how to make computers learn nearly as well as people learn.

• However, algorithms have been invented that are effective for certain types of learning tasks, and a theoretical understanding of learning is beginning to emerge.

• Many practical computer programs have been developed to exhibit useful types of learning, and significant commercial applications have begun to appear.

• For problems such as speech recognition, algorithms based on machine learning outperform all other approaches that have been attempted to date.

• In the field known as data mining, machine learning algorithms are being used routinely to discover valuable knowledge from large commercial databases containing equipment maintenance records, loan applications, financial transactions, medical records, and so on.
• A few specific achievements provide a glimpse of the state of the art. Programs have been developed that successfully learn to:

– recognize spoken words;

– predict recovery rates of pneumonia patients, detect fraudulent use of credit cards, and drive autonomous vehicles on public highways;

– play games such as backgammon at levels approaching the performance of human world champions.
What is data science?
• Data science is the field of applying advanced analytics techniques and scientific principles to extract valuable information from data for decision-making, strategic planning and other business uses.

• Data science combines math and statistics, specialized programming, advanced analytics, artificial intelligence (AI), and machine learning to uncover actionable insights hidden in an organization’s data.

• Data science is the study of data to extract meaningful insights for business.

• It combines principles and practices from the fields of mathematics, statistics, artificial intelligence, and computer engineering to analyze large amounts of data.

• These insights can be used to guide decision making and strategic planning.
Prerequisites for Data Science
1. Machine Learning: Machine learning is the backbone of data science. Data scientists need a solid grasp of ML in addition to basic knowledge of statistics.

2. Modeling: Mathematical models enable you to make quick calculations and predictions based on what you already know about the data.

3. Statistics: A firm grasp of statistics helps you extract more intelligence and more meaningful results from data.

4. Programming: The most common programming languages are Python and R. Python is especially popular because it is easy to learn and supports multiple libraries for data science and ML.

5. Databases: A capable data scientist needs to understand how databases work, how to manage them, and how to extract data from them.
Different Roles/Jobs in Data Science
• Data Scientist:
– Data scientists require computer science and pure science skills beyond those of a typical business analyst or data analyst.

– The data scientist must also understand the specifics of the business domain, such as manufacturing, eCommerce, healthcare, or agriculture.

• Job role: Determine what the problem is, what questions need answers, and where to find the data. They also mine, clean, and present the relevant data.

• Skills needed: Programming skills (SAS, R, Python), storytelling and data visualization, statistical and mathematical skills, and knowledge of Hadoop, SQL, and Machine Learning.
• Data Analyst:
• Job role: Analysts bridge the gap between the data scientists and the business analysts, organizing and analyzing data to answer the questions. They take the technical analyses and turn them into qualitative action items.

• Skills needed: Statistical and mathematical skills, programming skills (SAS, R, Python), plus experience in data wrangling and data visualization.

• Data Engineer:
• Job role: Data engineers focus on developing, deploying, managing, and optimizing the organization’s data infrastructure and data pipelines. Engineers support data scientists by helping to transfer and transform data for queries.

• Skills needed: NoSQL databases (e.g., MongoDB, Cassandra), programming languages such as Java and Scala, and frameworks (Apache Hadoop).
• Business Managers:
– Their primary responsibility is to collaborate with the data science team to characterize the problem and establish an analytical method.

– Their goal is to ensure that projects are completed on time by collaborating with data scientists and IT managers.

• IT Managers:
– They are primarily responsible for developing the infrastructure and architecture that enable data science activities.

– They constantly monitor data science teams and resource them accordingly to ensure that they operate efficiently and safely.

– They are also in charge of creating and maintaining IT environments for data science teams.
Data Science tools
• Data scientists rely on popular open-source programming languages that support pre-built statistical modeling, machine learning, and graphics capabilities:

1. R (commonly used through the RStudio IDE): an open-source programming language and environment for statistical computing and graphics.

2. Python: a dynamic and flexible programming language. Python has numerous libraries, such as NumPy, Pandas, and Matplotlib, for analyzing data quickly.
Data Science Tools

• Data Analysis: SAS, Jupyter, RStudio, MATLAB, Excel, RapidMiner

• Data Warehousing: Informatica, Talend, AWS Redshift

• Data Visualization: Jupyter, Tableau, Cognos, RAW

• Machine Learning: Spark MLlib, Mahout, Azure ML Studio
What is Machine Learning?
• Learning: "Learning is any process by which a system improves performance from experience." (Herbert Simon)

• Learning is the ability to improve one's behaviour based on experience.

• Machine learning is a discipline of computer science that uses computer algorithms/techniques and analytics to build predictive models that can solve business problems.

• Machine Learning explores algorithms that can
– learn from data / build a model from data
– use the model for prediction, decision making or solving some problem.

• Learning is used when:
– Human expertise does not exist (navigating on Mars)
– Humans are unable to explain their expertise (speech recognition)
– Solution changes in time (routing on a computer network)
– Solution needs to be adapted to particular cases (user biometrics)

• Definition by Tom Mitchell on machine learning:
• “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks T, as measured by P, improves with experience E.”
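Mitchell's definition can be made concrete with a toy program. The following is only an illustrative sketch: the task (classifying integers 0 to 99 as high or low), the true boundary at 60, the threshold learner and the example data are all assumptions made for this illustration and do not come from the slides. It shows performance P (accuracy) on task T improving as the experience E (labelled examples) grows.

```python
# Task T: classify an integer 0..99 as 1 ("high") or 0 ("low").
# Hypothetical ground truth for this sketch: the true boundary is at 60.
def true_label(x):
    return 1 if x >= 60 else 0


def learn_threshold(examples):
    """Learn a decision threshold from labelled (x, label) examples.

    The threshold is the midpoint between the largest "low" example
    and the smallest "high" example seen so far.
    """
    largest_low = max(x for x, label in examples if label == 0)
    smallest_high = min(x for x, label in examples if label == 1)
    return (largest_low + smallest_high) / 2


def accuracy(threshold, inputs):
    """Performance measure P: fraction of inputs classified correctly."""
    correct = sum((1 if x >= threshold else 0) == true_label(x) for x in inputs)
    return correct / len(inputs)


# Experience E: labelled examples (made up for this sketch), used two at a time.
experience = [(5, 0), (95, 1), (50, 0), (80, 1), (58, 0), (62, 1)]
test_inputs = list(range(100))

sizes = (2, 4, 6)
scores = [accuracy(learn_threshold(experience[:n]), test_inputs) for n in sizes]
for n, score in zip(sizes, scores):
    print(f"examples seen: {n}  accuracy: {score:.2f}")
```

With this particular data the accuracy goes from 0.90 to 0.95 to 1.00 as more examples are seen, which is the sense in which the program "learns" under the definition above.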
Components of a learning problem
• Task: The behaviour or task being improved.
– For example: classification, acting in an environment

• Data: The experiences that are being used to improve performance in the task.

• Measure of improvement:
– For example: increasing accuracy in prediction, acquiring new skills, improved speed and efficiency.
How Does Machine Learning Work?
• Machine learning accesses vast amounts of data (both structured and unstructured) and learns from it to predict the future.

• It learns from the data by using multiple algorithms and techniques.

• The diagram below shows how a machine learns from data:

Past Data → Machine Learning Algorithm → Output

• https://www.simplilearn.com/tutorials/artificial-intelligence-tutorial/ai-vs-machine-learning-vs-deep-learning
• Reference Books:
1. Tom Mitchell, Machine Learning, McGraw-Hill.
2. Introduction to Machine Learning, The MIT Press (2014).
Programs vs learning algorithms
• Algorithmic solution: Data + Program → Computer → Output

• Machine Learning solution: Data + Output → Computer → Program
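The contrast between the two solutions can be sketched in a few lines. This is a hedged illustration, not the method used in the course: the "is it hot?" task, the 30 °C rule, the example temperatures and the choice of a 1-nearest-neighbour learner are all assumptions. In the first case a person writes the program; in the second, data and desired outputs go in and a program (a callable) comes out.

```python
# Algorithmic solution: a person writes the program (the rule) by hand.
def is_hot_handwritten(temp_c):
    return temp_c >= 30  # hypothetical rule chosen by the programmer


# Machine learning solution: data and desired outputs go in, a program comes out.
def learn_program(inputs, outputs):
    """Return a 1-nearest-neighbour predictor built from example pairs."""
    pairs = list(zip(inputs, outputs))

    def program(x):
        # Answer with the output of the closest example seen during learning.
        _, nearest_output = min(pairs, key=lambda pair: abs(pair[0] - x))
        return nearest_output

    return program


# Made-up example data: temperatures in Celsius and whether they count as hot.
temps = [12, 18, 24, 31, 35, 40]
labels = [False, False, False, True, True, True]
is_hot_learned = learn_program(temps, labels)

print(is_hot_handwritten(33), is_hot_learned(33))
print(is_hot_handwritten(15), is_hot_learned(15))
```

The learned program agrees with the hand-written rule on inputs that resemble its examples, but its behaviour is determined entirely by the data it was given rather than by a rule someone typed in.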
Domains and ML Applications
Domain: Automobile
Example: A robot driving learning problem
• Task T: driving on public four-lane highways using vision sensors.

• Performance measure P: average distance traveled before an error.

• Training experience E: a sequence of images and steering commands recorded while observing a human driver.
Domain: Health Care

• Task T: Diagnose a disease
– Input: symptoms, lab measurements, test results.
– Output: one of a set of possible diseases, or “none of the above”.

• Data: Historical medical records.

• Learn: which future patients will respond best to which treatments.
Classification

• Example: Credit scoring
• Differentiating between low-risk and high-risk customers from their income and savings

Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
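The discriminant above translates directly into code. In this minimal sketch the threshold values and the customer data are hypothetical; in a real credit-scoring system θ1 and θ2 would be learned from past customer records rather than fixed by hand.

```python
def credit_risk(income, savings, theta1, theta2):
    """Apply the discriminant: low-risk only if both thresholds are exceeded."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"


# Hypothetical thresholds (assumptions for this sketch, not from the slides).
THETA1 = 40_000  # income threshold
THETA2 = 10_000  # savings threshold

# Made-up customers as (income, savings) pairs.
customers = [(55_000, 20_000), (55_000, 5_000), (30_000, 25_000)]
risk_labels = [credit_risk(income, savings, THETA1, THETA2)
               for income, savings in customers]
print(risk_labels)
```

Only the first customer exceeds both thresholds, so only that customer is labelled low-risk.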
Face Recognition

• Figure: training examples of a person, followed by test images.

• Image source: AT&T Laboratories, Cambridge UK, http://www.uk.research.att.com/facedatabase.html
Clustering
Regression

• Example: Price of a used car
• x: car attributes, y: price
• Linear model: y = wx + w0
• General form: y = g(x | θ), where g( ) is the model and θ are its parameters
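For the linear model y = wx + w0, the parameters w and w0 can be estimated from data by ordinary least squares. The sketch below assumes a single car attribute (age in years) and uses made-up prices that lie exactly on a line, so the fit is easy to check by hand; real data would be noisy and would usually involve several attributes.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = w*x + w0 with a single input x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance
    w0 = mean_y - w * mean_x
    return w, w0


# Hypothetical training data: car age in years and price.
ages = [1, 2, 3, 4, 5]
prices = [18_500, 17_000, 15_500, 14_000, 12_500]

w, w0 = fit_line(ages, prices)
predicted_price = w * 6 + w0  # predict the price of a 6-year-old car
print(f"w = {w:.1f}, w0 = {w0:.1f}, predicted price at age 6 = {predicted_price:.1f}")
```

Here the pair (w, w0) plays the role of the parameters θ in y = g(x | θ), and the line itself is the model g.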
Some other applications
• Fraud detection: credit card providers
– Determine whether or not someone will default on a home mortgage.
– Understand consumer sentiment from unstructured text data.
– Determine customer behaviour based on previous records/patterns.
• Speech recognition
• Face recognition
• Weather forecasting
• NLP:
– Detect where entities are mentioned in natural language text.
– Detect what facts are expressed in natural language text.
– Detect whether a product/movie review is positive, negative, or neutral.
• Financial:
– Predict whether a stock will rise or fall.
– Predict whether a user will click on an ad.