Machine Learning Unit-1.2

Uploaded by

sahil.utube2003
Copyright
© © All Rights Reserved

What is Machine Learning?

• ML is a subset of AI that trains machines to learn.
• It optimizes a performance criterion using example data or past experience.
• Role of computer science: efficient algorithms to
– solve the optimization problem
– represent and evaluate the model for inference.
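
To make "optimize a performance criterion using example data" concrete, here is a minimal sketch. It assumes a one-parameter model y = w·x and uses total squared error as the criterion; the model, the data, and the function names are illustrative assumptions, not part of the slides.

```python
def fit_slope(xs, ys):
    """Return the w that minimizes sum((y - w*x)**2) for the model y = w*x."""
    # Setting the derivative of the squared error to zero gives this closed form.
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def squared_error(w, xs, ys):
    """Performance criterion: total squared error of the model on the examples."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical example data (past experience): hours studied -> marks scored.
hours = [1.0, 2.0, 3.0, 4.0]
marks = [2.0, 4.0, 6.0, 8.0]

w = fit_slope(hours, marks)
print(w, squared_error(w, hours, marks))
```

Any other value of w gives a larger squared error on these examples, which is what "optimizing the criterion" means here.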
• Types of Machine Learning:
• Supervised learning:
• Supervised learning is the type of ML algorithm that requires both input and output data to be provided up front.
• Data engineers create an algorithm, then train it with a labeled dataset, one that contains actual input and output values.

– Given: Training data + Desired outputs (labels)
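
A minimal sketch of the "training data + labels" idea, using a 1-nearest-neighbor classifier as one simple supervised method. The choice of algorithm, the features, and the labels are all assumptions made for illustration.

```python
def nearest_neighbor_predict(train_inputs, train_labels, query):
    """Predict the label of `query` as the label of the closest training input."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    best_index = min(
        range(len(train_inputs)),
        key=lambda i: squared_distance(train_inputs[i], query),
    )
    return train_labels[best_index]


# Hypothetical labeled dataset: (hours studied, classes attended) -> result.
train_inputs = [(1, 2), (2, 1), (8, 9), (9, 8)]
train_labels = ["fail", "fail", "pass", "pass"]

print(nearest_neighbor_predict(train_inputs, train_labels, (7, 9)))
print(nearest_neighbor_predict(train_inputs, train_labels, (2, 2)))
```

The labels are what make this supervised: every training input comes with its desired output.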


Unsupervised Learning:
• Unsupervised learning is a type of ML algorithm that works with input data that has no examples or hints of the expected output.
• Its primary aim is to group the data into categories.

Unsupervised learning divides into the following subtypes:

• Association: discovers interesting relationships between items in large datasets.
• Clustering: a widely used kind of algorithm that groups pieces of data according to a shared trait.

– Given: Training data (without desired outputs)
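
As a rough illustration of clustering, the sketch below runs a basic k-means loop on one-dimensional data. K-means is only one possible clustering method, and the data and starting centers are hypothetical.

```python
def k_means_1d(values, centers, iterations=10):
    """Group 1-D values around `centers` with a basic k-means loop."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled data: marks of eight students, no categories given.
marks = [35, 38, 41, 40, 82, 85, 88, 90]
centers, clusters = k_means_1d(marks, centers=[0, 100])
print(centers, clusters)
```

No labels are supplied; the two groups emerge only from how close the values are to each other.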


Semi-supervised learning:
– Given: Training data + a few desired outputs
Reinforcement Learning:
• It is employed by various software and machines to find the best possible behavior or path to take in a specific situation.
• Reinforcement learning differs from supervised learning: in supervised learning the training data comes with the answer key, so the model is trained on the correct answers.
• In reinforcement learning there is no answer key; the reinforcement agent decides what to do to perform the given task.
• In the absence of a training dataset, it has to learn from its own experience.
• In reinforcement learning, the algorithm (in this context often referred to as the agent) learns through trial and error, using feedback on its own actions.
• Rewards and punishments operate as signals for desired and undesired behavior.
• Reinforcement learning is easiest to understand in the context of a game with a clear objective and a point system.
Example: We have an agent and a reward, with many hurdles in between. The agent is supposed to find the best possible path to reach the reward. The following example illustrates the problem.
• The image shows a robot, a diamond, and fire.
• The goal of the robot is to get the reward, which is the diamond, and to avoid the hurdles, which are the fires.
• The robot learns by trying the possible paths and then choosing the path that gives it the reward with the fewest hurdles.
• Each right step gives the robot a reward, and each wrong step subtracts from its reward.
• The total reward is calculated when it reaches the final reward, the diamond.
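
The robot, diamond, and fire example can be sketched with tabular Q-learning, one common reinforcement learning algorithm. The 3×3 grid layout, the reward values (+10 diamond, −10 fire, −1 per ordinary step), and the hyperparameters are all assumptions chosen for illustration; the slides do not specify them.

```python
import random

ROWS, COLS = 3, 3
START, FIRE, DIAMOND = (0, 0), (1, 1), (2, 2)  # hypothetical layout
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), ROWS - 1)  # walls keep the robot on the grid
    col = min(max(state[1] + d_col, 0), COLS - 1)
    nxt = (row, col)
    if nxt == DIAMOND:
        return nxt, 10.0, True   # final reward
    if nxt == FIRE:
        return nxt, -10.0, True  # punishment
    return nxt, -1.0, False      # small cost for every other step


def train(episodes=5000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0 for r in range(ROWS) for c in range(COLS) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(50):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START until the episode ends."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


q_table = train()
print(greedy_path(q_table))
```

After training, following the best-valued action from the start cell should lead the robot around the fire to the diamond.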
Applications in self-driving cars:
• Autonomous driving tasks where reinforcement learning could be applied include trajectory optimization, motion planning, dynamic pathing, controller optimization, and scenario-based learning policies for highways.
• For example, parking can be achieved by learning automatic parking policies.
• Lane changing can be achieved using Q-learning, while overtaking can be implemented by learning an overtaking policy that avoids collisions and maintains a steady speed thereafter.
• Reinforcement Learning applications in trading and finance:
• Supervised time series models can be used for predicting future sales as well as stock prices.
• However, these models don’t determine the action to take at a particular stock price.
• This is where reinforcement learning (RL) comes in: an RL agent can decide whether to hold, buy, or sell.
• The RL model is evaluated against market benchmark standards to check that it is performing optimally.
• Main points in Reinforcement learning:
• Input: The input should be an initial state from which the model will start.
• Output: There are many possible outputs, as there are a variety of solutions to a particular problem.
• Training: The training is based upon the input. The model returns a state, and the user decides to reward or punish the model based on its output.
• The model continues to learn.
• The best solution is decided based on the maximum reward.


Applications of Machine Learning

• Email filtering
• Speech recognition
• Face recognition
• Disease detection
• Fraud detection
• Computer vision
• Self-driving cars
• Weather forecasting
• Natural language processing (NLP)
• Amazon product recommendation, etc.
Issues in Machine Learning
• Machine learning is used in many sectors: healthcare, education, finance, automobile, marketing, shipping, infrastructure, automation, etc.
• It is used by big companies like Amazon, Facebook, Google, Adobe, etc.

But everything in this world has bright as well as dark sides.


Issues in Machine Learning

1. Inadequate Training Data

– Noisy data: it causes inaccurate predictions, which affects decisions as well as accuracy in classification tasks.
– Incorrect data: it is also responsible for faulty programming and faulty results in machine learning models, so it can reduce the accuracy of the results as well.

2. Poor quality of data

3. Overfitting and Underfitting

4. Irrelevant features

5. Lack of Skilled Resources

6. Inadequate Infrastructure
BASIC TYPES OF DATA IN MACHINE LEARNING

• A dataset is a collection of related information or records.
• The information may be on some entity or some subject area.
• For example (Fig.2.2), we may have a dataset on students in which each record consists of information about a specific student.
• Similarly, we can have a dataset on student performance whose records describe performance.
• Each row of a data set is called a record.

• Each data set also has multiple attributes, each of which gives information on a
specific characteristic.

• For example, in the dataset on students, there are four attributes.

• Attributes are also called features, variables, dimensions, or fields.

• Both datasets, Student and Student Performance, have four features or dimensions; hence they are said to have a four-dimensional data space.
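
A minimal sketch of records and attributes in code. Fig.2.2 is not reproduced in this text, so the attribute names and values below are hypothetical; only the count of four attributes follows the slides.

```python
# Hypothetical student dataset: each dict is one record (row),
# each key is one attribute (feature / variable / dimension / field).
students = [
    {"roll_number": 101, "name": "Asha", "gender": "F", "age": 14},
    {"roll_number": 102, "name": "Ravi", "gender": "M", "age": 15},
    {"roll_number": 103, "name": "Meera", "gender": "F", "age": 14},
]

num_records = len(students)            # number of rows
attributes = list(students[0].keys())  # column names
num_dimensions = len(attributes)       # size of the data space

print(num_records, attributes, num_dimensions)
```

With four attributes per record, this dataset has a four-dimensional data space, however many records it holds.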
Types of Data in ML

• Data can broadly be divided into the following two types:
1. Qualitative data
2. Quantitative data
Qualitative data
• Qualitative data provides information about the quality of an object, or information which cannot be measured.
• For example, if we consider the quality of students in terms of performance such as ‘Good’, ‘Average’, and ‘Poor’, it falls under the category of qualitative data.
• Also, the name or roll number of a student is information that cannot be measured using some scale of measurement, so it would fall under qualitative data.
• Qualitative data is also called categorical data.
• Qualitative data can be further subdivided into two types as follows:
1. Nominal data
2. Ordinal data
Nominal data
• Nominal data is data which has no numeric value, but a named value.
• It is used for assigning named values to attributes. Nominal values cannot be quantified.

• Examples of nominal data are


1. Blood group: A, B, O, AB, etc.
2. Nationality: Indian, American, British, etc.
3. Gender: Male, Female, Other.

• A special case of nominal data is when only two labels are possible, e.g. pass/fail as a result of an examination.
• This sub-type of nominal data is called ‘dichotomous’.
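
Because nominal values are only labels, about the only summary available is counting them. A small sketch with made-up sample values:

```python
from collections import Counter

# Hypothetical sample of a nominal attribute.
blood_groups = ["A", "B", "O", "O", "AB", "O", "A"]

counts = Counter(blood_groups)                      # frequency of each label
mode_value, mode_count = counts.most_common(1)[0]   # most frequent label

# A nominal attribute with exactly two labels is dichotomous.
results = ["pass", "fail", "pass", "pass"]
is_dichotomous = len(set(results)) == 2

print(mode_value, mode_count, is_dichotomous)
```

Averaging the labels would be meaningless here; only frequencies and the most common label make sense.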


Ordinal data

• Ordinal data assigns named values to attributes but, unlike nominal data, the values can be arranged in a sequence of increasing or decreasing value, so that we can say whether one value is better than or greater than another.

• Examples of ordinal data are


1. Customer satisfaction: ‘Very Happy’, ‘Happy’, ‘Unhappy’, etc.
2. Grades: A, B, C, etc.
3. Hardness of Metal: ‘Very Hard’, ‘Hard’, ‘Soft’, etc.

• Like nominal data, basic counting is possible for ordinal data, so the mode can be identified.
• Since ordering is possible for ordinal data, the median and quartiles can be identified in addition.
• The mean still cannot be calculated.
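
A minimal sketch of finding the mode and median of ordinal data. It assumes the ordering of the labels is known in advance; the responses are hypothetical.

```python
import statistics

# Assumed ordering of the labels, from lowest to highest.
ORDER = ["Unhappy", "Happy", "Very Happy"]
rank = {label: position for position, label in enumerate(ORDER)}

# Hypothetical customer satisfaction responses.
responses = [
    "Very Happy", "Very Happy", "Unhappy", "Happy",
    "Very Happy", "Unhappy", "Happy",
]

# Mode needs only counting, so it works exactly as for nominal data.
mode_label = statistics.mode(responses)

# Median needs the ordering: sort by rank and take the middle position.
median_rank = statistics.median_low(rank[r] for r in responses)
median_label = ORDER[median_rank]

print(mode_label, median_label)
```

The ranks are used only to order the labels. Averaging them would treat the gaps between labels as equal, which ordinal data does not guarantee.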


Quantitative data
• Quantitative data relates to information about the quantity of an object, hence it can be measured.
• For example, if we consider the attribute ‘marks’, it can be measured using a scale of measurement.
• Quantitative data is also termed numeric data.

• There are two types of quantitative data:


1. Interval data
2. Ratio data
Interval data
• Interval data is numeric data for which not only the order is known, but the exact difference between values is also known.
• An ideal example of interval data is Celsius temperature. The difference between each value remains the same in Celsius temperature.
• For example, the difference between 12°C and 18°C is measurable and is 6°C, the same as the difference between 15.5°C and 21.5°C. Other examples include date, time, etc.
• For interval data, mathematical operations such as addition and subtraction are possible. For that reason, the central tendency can be measured by mean, median, or mode. Standard deviation can also be calculated.
• However, interval data do not have a ‘true zero’ value.
• For example, there is no such thing as ‘0 temperature’ or ‘no temperature’; 0°C is just another point on the scale. Hence, only addition and subtraction apply to interval data, and ratios cannot be applied. This means we can say that a temperature of 40°C equals a temperature of 20°C plus 20°C.
• However, we cannot say that a temperature of 40°C is twice as hot as a temperature of 20°C.
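
The sketch below checks this with numbers. The Kelvin conversion is brought in as an outside fact (the Kelvin scale has a true zero), not something the slides mention.

```python
def celsius_to_kelvin(celsius):
    """Convert to Kelvin, a temperature scale that has a true zero."""
    return celsius + 273.15


# Differences are meaningful on an interval scale.
diff_a = 18.0 - 12.0   # 6.0
diff_b = 21.5 - 15.5   # 6.0

# The ratio of Celsius readings looks like "twice as hot" ...
naive_ratio = 40.0 / 20.0

# ... but on a scale with a true zero the ratio is only about 1.07.
kelvin_ratio = celsius_to_kelvin(40.0) / celsius_to_kelvin(20.0)

print(diff_a, diff_b, naive_ratio, round(kelvin_ratio, 3))
```

The difference of 6 degrees survives the change of scale, while the ratio of 2 does not, which is why ratios are not used with interval data.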
Ratio data
• Ratio data represents numeric data for which the exact value can be measured.
• Absolute zero is available for ratio data.
• Also, these variables can be added, subtracted, multiplied, or divided.
• The central tendency can be measured by mean, median, or mode, and dispersion by measures such as standard deviation.
• Examples of ratio data include height, weight, age, salary, etc.
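
A short sketch of these operations on a ratio attribute, using Python's standard `statistics` module. The heights are hypothetical.

```python
import statistics

# Hypothetical heights in centimetres (a ratio attribute: 0 cm means no height).
heights_cm = [150.0, 160.0, 170.0, 180.0]

mean_height = statistics.mean(heights_cm)
median_height = statistics.median(heights_cm)
spread = statistics.pstdev(heights_cm)  # population standard deviation

# Division is meaningful because the scale has an absolute zero.
ratio = heights_cm[3] / heights_cm[0]   # the tallest is 1.2 times the shortest

print(mean_height, median_height, round(spread, 2), ratio)
```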
• Apart from the approach detailed above, attributes can also be categorized into types based on the number of values that can be assigned.
• The attributes can be either discrete or continuous based on this factor.
• Discrete attributes can assume a finite or countably infinite number of values. Nominal attributes such as roll number, street number, pin code, etc. can have a finite number of values, whereas numeric attributes such as count, rank of students, etc. can have countably infinite values.
• A special type of discrete attribute which can assume only two values is called a binary attribute. Examples of binary attributes include male/female, positive/negative, yes/no, etc.
• Continuous attributes can assume any possible value which is a real number. Examples of continuous attributes include length, height, weight, price, etc.
• In general, nominal and ordinal attributes are discrete.
• On the other hand, interval and ratio attributes are continuous, barring a few exceptions, e.g. the ‘count’ attribute.
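
As a rough illustration only, the helper below guesses the kind of an attribute from the values observed in a column. The function name and its rules are hypothetical heuristics: whether an attribute is truly discrete or continuous depends on what it measures, not just on the sample.

```python
def attribute_kind(values):
    """Rough heuristic: classify a column as binary, discrete, or continuous."""
    distinct = set(values)
    if len(distinct) == 2:
        return "binary"      # only two values observed, e.g. yes/no
    if any(isinstance(v, float) for v in values):
        return "continuous"  # real-valued measurements such as height
    return "discrete"        # labels or whole-number counts


print(attribute_kind(["yes", "no", "yes"]))
print(attribute_kind([1, 2, 3, 7]))
print(attribute_kind([150.5, 162.0, 171.3]))
```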
