Types of ML

The document outlines a 7-step process for machine learning model development: 1) Collect accurate data, 2) Prepare the data through visualization, balancing classes, and separating into training and evaluation sets, 3) Choose an appropriate model type, 4) Train the model, 5) Evaluate the model's accuracy on the evaluation set, 6) Tune hyperparameters if needed to improve accuracy, 7) Use the trained model to make predictions on new data. There are three main types of machine learning: supervised learning which uses labeled training data, unsupervised learning which finds patterns in unlabeled data, and reinforcement learning which learns through trial-and-error interactions with an environment.


Step 1: Collect Data

For the problem you would like to solve, you have to obtain data from your own
sources to feed into your machine. More importantly, the data you feed into the
machine should be the right data; otherwise your model won't work. The sources of
data can be a newspaper or the web, and the data should be precise and accurate.
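
As a simple illustration (assuming tabular data in a CSV file and the pandas library; the file name and its contents are hypothetical), collecting data can start with loading it and inspecting it:

```python
# A minimal sketch of data collection, assuming a hypothetical CSV file
# with tabular data; pandas is a common choice but not required.
import pandas as pd

# Load data from a local file (it could equally come from the web or a database).
data = pd.read_csv("measurements.csv")

# Inspect the first rows and basic statistics to confirm the data looks right.
print(data.head())
print(data.describe())
```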

Step 2: Prepare the data

This is a good time to visualize your data and check for correlations between the
different characteristics you obtained. It will be necessary to make a selection of
characteristics (features), since the ones you choose will directly impact execution
times and results. Additionally, you must balance the amount of data you have for each
result class so that it is significant; otherwise the learning may be biased towards one
type of response, and when your model tries to generalize its knowledge it will fail.

You must also separate the data into two groups: one for training and the other for
model evaluation. The split is commonly about 80/20, but it can vary depending on
the case and the volume of data you have.

At this stage, you can also pre-process your data by normalizing, eliminating
duplicates, and making error corrections.
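
A minimal preparation sketch, assuming scikit-learn and the hypothetical pandas DataFrame from Step 1 with a "label" column:

```python
# A sketch of data preparation under the assumptions above:
# `data` is a pandas DataFrame and "label" is the target column (both hypothetical).
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data = data.drop_duplicates()              # eliminate duplicate rows
X = data.drop(columns=["label"]).values    # input characteristics (features)
y = data["label"].values                   # result class for each example

# 80/20 split into training and evaluation sets; stratify keeps the
# class balance similar in both groups.
X_train, X_eval, y_train, y_eval = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Normalize features using statistics from the training set only.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_eval = scaler.transform(X_eval)
```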

Step 3: Choose the model

There are several models you can choose from according to your objective: algorithms
for classification, prediction, and linear regression; clustering (e.g., k-means or
K-Nearest Neighbors); deep learning (e.g., neural networks); Bayesian methods; and so on.

There are various models to choose from depending on the data you are going to
process, such as images, sound, text, or numerical values, and each family has
typical applications you can apply in your projects.
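
As a rough illustration (assuming the scikit-learn library, which the text does not name), each of these families maps onto a ready-made model class:

```python
# Illustrative model choices from the families mentioned above,
# using scikit-learn; which one fits depends on your objective and data.
from sklearn.linear_model import LinearRegression    # regression / prediction
from sklearn.neighbors import KNeighborsClassifier   # classification
from sklearn.cluster import KMeans                   # clustering (unsupervised)
from sklearn.naive_bayes import GaussianNB           # Bayesian classification
from sklearn.neural_network import MLPClassifier     # a small neural network

model = KNeighborsClassifier(n_neighbors=5)  # e.g., a classifier for labeled data
```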

Step 4: Train your model

You will need to train the model on the training dataset, running the training
repeatedly and watching for an incremental improvement in the prediction rate.
Remember to initialize the weights of your model randomly (the weights are the values
that multiply or affect the relationships between the inputs and outputs); they will be
automatically adjusted by the selected algorithm the more you train.
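
A hedged training sketch, reusing the hypothetical X_train and y_train from Step 2 and a small neural network so the randomly initialized weights and training iterations described above are visible:

```python
# Training sketch, assuming X_train and y_train from Step 2.
# An MLPClassifier makes the ideas above concrete: its weights are
# initialized randomly and adjusted automatically on each pass (epoch).
from sklearn.neural_network import MLPClassifier

model = MLPClassifier(
    hidden_layer_sizes=(32,),  # one hidden layer of 32 neurons
    max_iter=200,              # number of training iterations (epochs)
    random_state=0,            # controls the random weight initialization
)
model.fit(X_train, y_train)    # the algorithm adjusts the weights itself
```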

Step 5: Evaluation

You will have to check the trained model against your evaluation dataset, which
contains inputs the model has never seen, and verify the accuracy of your already
trained model. If the accuracy is less than or equal to 50%, the model will not be
useful, since it would be like tossing a coin to make decisions. If you reach 90% or
more, you can have good confidence in the results the model gives you.
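
A minimal evaluation sketch under the same assumptions, checking the trained model against the held-out 20%:

```python
# Evaluation sketch, assuming the model and split from the previous steps.
from sklearn.metrics import accuracy_score

y_pred = model.predict(X_eval)            # inputs the model has never seen
accuracy = accuracy_score(y_eval, y_pred)
print(f"Accuracy on the evaluation set: {accuracy:.2%}")

# Roughly: <= 50% is no better than a coin toss; >= 90% inspires confidence.
```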

Step 6: Parameter Tuning

If during the evaluation you did not obtain good predictions and your accuracy is
below the minimum desired, it is possible that you have overfitting or underfitting
problems, and you must return to the training step with a new configuration of
parameters for your model. You can increase the number of times you iterate over your
training data, termed epochs. Another important parameter is the "learning rate",
which is usually a value that multiplies the gradient to gradually bring it closer to
the global (or local) minimum, minimizing the cost function.

The value you choose matters: raising the learning rate from 0.001 in steps of 0.1 is
not a small change, and it can significantly affect the model's execution time. You
can also indicate the maximum error allowed for your model. Training can take
anywhere from a few minutes to hours, or even days. These parameters are often called
hyperparameters. This "tuning" is still more of an art than a science and will improve
as you experiment. There are usually many parameters to adjust, and their combinations
quickly multiply the options you have to try. Each algorithm has its own parameters to
adjust. To name a few more, in Artificial Neural Networks (ANNs) you must define the
number of hidden layers in the architecture and how many neurons each layer has,
gradually testing with more or fewer of each. Getting good results takes considerable
effort and patience.
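
One common way to tune hyperparameters is a grid search. The sketch below assumes scikit-learn and the MLP model from Step 4; the grid values are arbitrary examples rather than recommendations:

```python
# A hedged tuning sketch: a small grid search over a few hyperparameters
# of the MLPClassifier from Step 4. The grid values are arbitrary examples.
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

param_grid = {
    "learning_rate_init": [0.001, 0.01, 0.1],        # learning rate
    "max_iter": [200, 500],                          # epochs over the training data
    "hidden_layer_sizes": [(16,), (32,), (32, 16)],  # hidden layers / neurons
}
search = GridSearchCV(MLPClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
```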

Step 7: Prediction or Inference

You are now ready to use your machine learning model to infer results in real-life
scenarios.
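
A small inference sketch under the same assumptions; `new_samples` is hypothetical data in the same format as the training features:

```python
# Inference sketch: applying the trained model (and the same scaler)
# to new, previously unseen samples. `new_samples` is hypothetical data.
import numpy as np

new_samples = np.array([[5.1, 3.5, 1.4, 0.2]])   # one new observation
new_samples = scaler.transform(new_samples)      # same preprocessing as training
print(model.predict(new_samples))                # the model's prediction for it
```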

As with any method, there are different ways to train machine learning
algorithms, each with its own advantages and disadvantages. To
understand the pros and cons of each type of machine learning, we must
first look at what kind of data they ingest. In ML, there are two kinds of data:
labeled data and unlabeled data.

Labeled data has both the input and output parameters in a completely
machine-readable form, but it requires a lot of human labor to label the
data in the first place. Unlabeled data has only one of the parameters, or
neither, in a machine-readable form. This removes the need for human labor
but requires more complex solutions.

There are also some types of machine learning algorithms that are used in
very specific use-cases, but three main methods are used today.

Supervised learning

Supervised learning is one of the most basic types of machine learning. In
this type, the machine learning algorithm is trained on labeled data. Even
though the data needs to be labeled accurately for this method to work,
supervised learning is extremely powerful when used in the right
circumstances.

In supervised learning, the ML algorithm is given a small training dataset to
work with. This training dataset is a smaller part of the bigger dataset and
serves to give the algorithm a basic idea of the problem, solution, and data
points to be dealt with. The training dataset is also very similar to the final
dataset in its characteristics and provides the algorithm with the labeled
parameters required for the problem.

The algorithm then finds relationships between the parameters given,
essentially establishing a cause and effect relationship between the
variables in the dataset. At the end of the training, the algorithm has an
idea of how the data works and the relationship between the input and the
output.

This solution is then deployed for use with the final dataset, which it learns
from in the same way as the training dataset. This means that supervised
machine learning algorithms will continue to improve even after being
deployed, discovering new patterns and relationships as they train themselves
on new data.
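
As a toy illustration of labeled data (the feature values and labels below are invented), a supervised learner is handed both inputs and outputs and learns the relationship between them:

```python
# A toy supervised-learning example: every training input comes with a label.
from sklearn.tree import DecisionTreeClassifier

# Labeled data: inputs (height in cm, weight in kg) and output labels.
X_labeled = [[150, 50], [160, 60], [180, 90], [190, 100]]
y_labels = ["small", "small", "large", "large"]

clf = DecisionTreeClassifier().fit(X_labeled, y_labels)
print(clf.predict([[175, 80]]))   # the learned input -> output relationship
```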

Unsupervised Learning
Unsupervised machine learning holds the advantage of being able to work
with unlabeled data. This means that human labor is not required to make
the dataset machine-readable, allowing much larger datasets to be worked
on by the program.

In supervised learning, the labels allow the algorithm to find the exact
nature of the relationship between any two data points. However,
unsupervised learning does not have labels to work off of, resulting in the
creation of hidden structures. Relationships between data points are
perceived by the algorithm in an abstract manner, with no input required
from human beings.

The creation of these hidden structures is what makes unsupervised
learning algorithms versatile. Instead of a defined and set problem
statement, unsupervised learning algorithms can adapt to the data by
dynamically changing hidden structures. This offers more post-deployment
development than supervised learning algorithms.
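
A toy sketch of unsupervised learning (again with invented data): the algorithm receives only inputs and groups them by similarity, with no labels involved:

```python
# A toy unsupervised-learning example: no labels, only inputs.
# KMeans discovers a "hidden structure" by grouping similar points.
import numpy as np
from sklearn.cluster import KMeans

X_unlabeled = np.array([[1.0, 1.1], [0.9, 1.0], [8.0, 8.2], [8.1, 7.9]])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_unlabeled)
print(kmeans.labels_)   # cluster assignment found without any human labels
```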

Semi-supervised Learning

In semi-supervised learning, only a small amount of the data is labeled. The computer
only needs to find features in the labeled data and can then classify the other data
accordingly. This method can make predictions more accurate and is a very commonly
used approach. For example, if there are 100 photos and only 10 of them are labeled
as elephant or giraffe, the machine can use the characteristics of those 10 photos to
identify and classify the remaining photos. Because there is already a basis for
identification, the predicted results are usually more accurate than with
unsupervised learning.
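
A sketch of this idea, assuming scikit-learn's SelfTrainingClassifier and synthetic numeric data standing in for the photos; unlabeled samples are marked with -1:

```python
# A sketch of semi-supervised learning with scikit-learn's SelfTrainingClassifier,
# mirroring the example above: a few labeled samples, many unlabeled ones (label -1).
import numpy as np
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)),     # 100 "photos" as numeric features
               rng.normal(5, 1, (50, 2))])
y = np.full(100, -1)        # -1 marks unlabeled samples
y[:5] = 0                   # 5 labeled "elephants"
y[50:55] = 1                # 5 labeled "giraffes"

# The base classifier pseudo-labels the remaining 90 samples from the 10 labels.
model = SelfTrainingClassifier(SVC(probability=True)).fit(X, y)
print(model.predict(X[:3]), model.predict(X[-3:]))
```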

Reinforcement Learning

Reinforcement learning directly takes inspiration from how human beings learn from
data in their lives. It features an algorithm that improves upon itself and learns from
new situations using a trial-and-error method. Favorable outputs are encouraged or
‘reinforced’, and non-favorable outputs are discouraged or ‘punished’.

Based on the psychological concept of conditioning, reinforcement learning works by
putting the algorithm in a work environment with an interpreter and a reward system.
In every iteration of the algorithm, the output result is given to the interpreter, which
decides whether the outcome is favorable or not.

If the program finds the correct solution, the interpreter reinforces it by providing
a reward to the algorithm. If the outcome is not favorable, the algorithm is forced to
iterate again until it finds a better result. In most cases, the reward system is
directly tied to the effectiveness of the result.

In typical reinforcement learning use-cases, such as finding the shortest route
between two points on a map, the solution is not an absolute value. Instead, it takes
on a score of effectiveness, expressed in a percentage value. The higher this
percentage value is, the more reward is given to the algorithm. Thus, the program is
trained to give the best possible solution for the best possible reward.
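
A minimal tabular Q-learning sketch of this trial-and-error loop; the toy environment (a short line of states with a goal at the end), the rewards, and the hyperparameters are all invented for illustration:

```python
# A minimal tabular Q-learning sketch for a toy "reach the goal" task:
# states 0..4 lie on a line, the goal is state 4, actions are left/right.
# The reward plays the role of the interpreter: +1 for reaching the goal, 0 otherwise.
# All numbers here are illustrative choices, not a standard benchmark.
import random

random.seed(0)
n_states, actions = 5, [-1, +1]          # move left or right
Q = [[0.0, 0.0] for _ in range(n_states)]
alpha, gamma, epsilon = 0.5, 0.9, 0.2    # learning rate, discount, exploration

for episode in range(200):
    state = 0
    while state != n_states - 1:
        if random.random() < epsilon:
            a = random.randrange(2)              # explore a random action
        else:
            a = Q[state].index(max(Q[state]))    # exploit the best known action
        next_state = min(max(state + actions[a], 0), n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Reinforce: move Q towards reward + discounted best future value.
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state

print([q.index(max(q)) for q in Q])  # greedy action per state after training (1 = right)
```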
