Machine Learning
Linear Algebra:
Linear algebra is an essential branch of mathematics: the study of vectors, matrices, lines, planes, and the mappings required for linear transformations. It enables ML algorithms to operate efficiently on very large datasets.
Types of machine learning:
1. Unsupervised Learning
2. Supervised Learning
3. Reinforcement Learning
Vapnik-Chervonenkis (VC) dimension
The Vapnik-Chervonenkis (VC) dimension is a measure of the capacity of a statistical classification algorithm, defined as the cardinality of the largest set of points that the algorithm can shatter. For example, linear classifiers in the plane can shatter some set of 3 points (any non-collinear triple) but no set of 4 points, so their VC dimension is 3.
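As a rough, added illustration (not from the original notes), the sketch below heuristically checks whether a linear classifier can shatter small 2D point sets; the point sets, scikit-learn's LinearSVC, and the large-C "perfect training accuracy means separable" heuristic are all assumptions:

```python
# Heuristic check of shattering by linear classifiers in 2D (illustrative sketch).
from itertools import product
import numpy as np
from sklearn.svm import LinearSVC

def can_shatter(points):
    points = np.asarray(points, dtype=float)
    for labels in product([0, 1], repeat=len(points)):
        y = np.array(labels)
        if len(set(labels)) < 2:
            continue  # single-class labelings are trivially realizable
        clf = LinearSVC(C=1e6, max_iter=100000)  # weak regularization: try hard to separate
        clf.fit(points, y)
        if clf.score(points, y) < 1.0:
            return False  # found a labeling no line separates
    return True

three_points = [(0, 0), (1, 0), (0, 1)]         # non-collinear triple
four_points = [(0, 0), (1, 1), (0, 1), (1, 0)]  # XOR configuration
print(can_shatter(three_points))  # expected True  -> lines in 2D can shatter 3 points
print(can_shatter(four_points))   # expected False -> this 4-point set is not shattered
```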
Ensemble learning, simplest approach (voting):
1. Generate multiple classifiers.
2. Each classifier votes on the test instance.
3. Take the majority vote as the classification.
The classifiers differ because of different sampling of the training data, or because of randomized parameters within the classification algorithm.
Aim: take a simple, mediocre algorithm and turn it into a much stronger classifier without requiring any fancy new algorithm.
• In bagging, we use bootstrap sampling to obtain
subsets of data for training a set of base models.
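A minimal sketch of bagging with majority voting (added for illustration); bootstrap resampling, decision-tree base models, numpy arrays, and integer class labels are assumptions of this example:

```python
# Bagging sketch: bootstrap-sample the training set, fit one base model per sample,
# and classify by majority vote. X and y are assumed to be numpy arrays with
# integer class labels 0..K-1.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, n_models=25, random_state=0):
    rng = np.random.default_rng(random_state)
    models = []
    n = len(X)
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)                # bootstrap sample (with replacement)
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    votes = np.stack([m.predict(X) for m in models])    # shape: (n_models, n_samples)
    # majority vote per test instance
    return np.array([np.bincount(col.astype(int)).argmax() for col in votes.T])
```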
The MLP (multilayer perceptron) uses backpropagation for training the network.
ACTIVATION FUNCTIONS
The activation function of a node defines the output of that node. Four activation functions are especially popular:
1. Sigmoid function - Better than the step function: it also limits the output to the range 0 to 1, but it smooths the values. Because it is continuous, its output can be interpreted as a probability. It is typically used for binary classification problems.
4. Rectified Linear Unit (ReLU) - ReLU behaves like half of the step function: it suppresses negative values (outputs zero for negative inputs, and the input itself otherwise). It is the most popular and most widely used activation function.
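The two activations above can be sketched in a few lines of numpy; the sample inputs are illustrative:

```python
import numpy as np

def sigmoid(x):
    # squashes any real input into (0, 1); continuous and smooth
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # passes positive values through unchanged, suppresses negatives to 0
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x))  # values between 0 and 1
print(relu(x))     # [0.  0.  0.  0.5 2. ]
```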
NETWORK TRAINING
1. First, the ANN's weights are initialized randomly.
2. Split the dataset into batches (of a chosen batch size).
3. Send the batches one by one to the GPU.
4. Compute the forward pass (what the output would be with the current weights).
5. Compare the computed output to the expected output (the loss).
6. Adjust the weights (incrementing or decrementing in steps scaled by the learning rate) according to the backward pass (backward gradient propagation).
7. Go back to step 2 and repeat until training converges.
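The seven steps above map directly onto a minimal training loop. The sketch below is an illustration only: PyTorch, the toy data, the two-layer model, the cross-entropy loss, and the SGD optimizer are all assumptions, not part of the original notes.

```python
# Minimal training-loop sketch following the steps above (PyTorch assumed).
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

X = torch.randn(1000, 20)                      # toy data (assumption)
y = (X.sum(dim=1) > 0).long()

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))  # step 1: random init
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)  # step 2: batches
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)                # learning rate

for epoch in range(5):                         # step 7: repeat
    for xb, yb in loader:                      # step 3: send batches one by one
        xb, yb = xb.to(device), yb.to(device)
        logits = model(xb)                     # step 4: forward pass
        loss = loss_fn(logits, yb)             # step 5: compare output to target (loss)
        optimizer.zero_grad()
        loss.backward()                        # step 6: backward gradient propagation
        optimizer.step()                       # step 6: adjust the weights
```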
GRADIENT DESCENT OPTIMIZATION
• Gradient Descent is one of the most commonly used optimization algorithms for training machine learning models; it works by minimizing the error between actual and predicted results.
• It finds a local minimum of a function by repeatedly stepping in the direction of the negative gradient.
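A minimal numpy sketch of gradient descent; the quadratic objective, learning rate, and step count are illustrative assumptions:

```python
import numpy as np

def gradient_descent(grad, x0, lr=0.1, n_steps=100):
    # repeatedly step against the gradient: x <- x - lr * grad(x)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x - lr * grad(x)
    return x

# example: minimize f(x, y) = (x - 3)^2 + (y + 1)^2, whose gradient is (2(x-3), 2(y+1))
grad_f = lambda v: np.array([2 * (v[0] - 3), 2 * (v[1] + 1)])
print(gradient_descent(grad_f, x0=[0.0, 0.0]))   # converges near [3, -1]
```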
A hyperparameter is a parameter that cannot be directly learned from the regular training process; it must be set before training.
Ex: the learning rate, the batch size, or the dropout rate.
Dropout refers to dropping out nodes (in the input and hidden layers) of a neural network during training. It is a regularization method that approximates training a large number of neural networks with different architectures in parallel.
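A minimal sketch of (inverted) dropout applied to one layer's activations; the dropout rate and the inverted-scaling convention are assumptions of this illustration:

```python
import numpy as np

def dropout(activations, rate=0.5, training=True, seed=0):
    # randomly zero out a fraction `rate` of units during training;
    # scale the survivors by 1/(1-rate) so the expected activation is unchanged
    if not training or rate == 0.0:
        return activations
    rng = np.random.default_rng(seed)
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

h = np.ones((2, 6))
print(dropout(h, rate=0.5))   # roughly half the units zeroed, survivors scaled to 2.0
```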
Unit V: Design and Analysis of Machine Learning
• Selection of the Response Variable: what should we use as the quality measure (e.g., error, precision and recall, complexity, etc.)?
• Choice of Factors and Levels: what are the factors for the defined aim of the study? (The factors are the hyperparameters when the algorithm is fixed and we want to find the best hyperparameters; if we are comparing algorithms, the learning algorithm itself is a factor.)
• Choice of Experimental Design: use a factorial design unless we are sure that the factors do not interact.
• The replication number depends on the dataset size; it can be kept small when the dataset is large.
• Avoid using small datasets (if possible): they lead to responses with high variance, so the differences will not be significant and the results will not be conclusive.
• Performing the Experiment: do a few trial runs with some random settings to check that everything works as expected, before running the full factorial experiment.
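For illustration (not from the notes), a small factorial experiment over two hyperparameter factors might look like the sketch below; the SVM model, the two factors and their levels, and the use of scikit-learn cross-validation for replication are all assumptions:

```python
# Factorial design sketch: evaluate every combination of factor levels,
# with cross-validation providing the replicated response measurements.
from itertools import product
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

factors = {
    "C": [0.1, 1.0, 10.0],        # factor 1 with three levels
    "gamma": [0.01, 0.1, 1.0],    # factor 2 with three levels
}

for C, gamma in product(factors["C"], factors["gamma"]):
    scores = cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=5)
    print(f"C={C:<5} gamma={gamma:<5} mean={scores.mean():.3f} std={scores.std():.3f}")
```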
• It is very important not to test your model with the same data that you used for training.
• In K-fold cross-validation, the data sample is split into k smaller subsets (folds); each fold is used once as the test set while the remaining k-1 folds are used for training.
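A short K-fold cross-validation sketch; scikit-learn, the iris dataset, and the logistic-regression model are assumptions of this example:

```python
# K-fold cross-validation sketch (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=5, shuffle=True, random_state=0)   # k = 5 folds

# each fold serves once as the held-out test set; the other 4 folds train the model
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kf)
print(scores, scores.mean())
```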
• Accuracy
• Confusion Matrix
• Precision
• Recall
• F-Score
• AUC-ROC (Area Under the ROC Curve)
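Several of these metrics can be computed directly from a 2x2 confusion matrix, as in the sketch below; the counts are made-up illustrative numbers:

```python
# Confusion matrix for a binary problem (illustrative counts):
#                 predicted +   predicted -
# actual +            TP=40          FN=10
# actual -            FP=5           TN=45
TP, FN, FP, TN = 40, 10, 5, 45

accuracy  = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)           # of the predicted positives, how many are correct
recall    = TP / (TP + FN)           # of the actual positives, how many are found
f_score   = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f_score)
```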
ASSESSING A SINGLE CLASSIFICATION ALGORITHM
1. Training time
2. Inference time
3. Inference accuracy (F1 score)
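A sketch of measuring these three quantities for a single classifier; scikit-learn, the dataset, the random-forest model, and the train/test split are assumptions of this example:

```python
import time
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0)

t0 = time.perf_counter()
clf.fit(X_train, y_train)                 # 1. training time
train_time = time.perf_counter() - t0

t0 = time.perf_counter()
y_pred = clf.predict(X_test)              # 2. inference time
infer_time = time.perf_counter() - t0

print(train_time, infer_time, f1_score(y_test, y_pred))   # 3. inference accuracy (F1)
```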
T TEST
• A t-test is a type of inferential statistic used to determine whether there is a significant difference between the means of two groups, which may be related in certain features.
• If the t-value is large, the two group means likely differ (the samples probably come from different populations).
• If the t-value is small, the two group means are likely similar (the samples probably come from the same population).
• There are three types of t-tests:
1. Independent samples t-test: compares the means of two independent groups.
2. Paired sample t-test: compares means from the same group at different times (say, one year apart).
3. One-sample t-test: tests the mean of a single group against a known mean.
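In a machine learning context, a paired t-test is often used to compare two classifiers evaluated on the same cross-validation folds. The sketch below assumes scipy, scikit-learn, and two illustrative models:

```python
# Paired t-test on per-fold accuracies of two classifiers (illustrative sketch).
from scipy.stats import ttest_rel
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
cv = KFold(n_splits=10, shuffle=True, random_state=0)   # same folds for both models

scores_a = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
scores_b = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=cv)

t_stat, p_value = ttest_rel(scores_a, scores_b)          # paired: same folds
print(t_stat, p_value)   # small p-value -> the mean accuracies differ significantly
```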
MCNEMAR’S TEST
McNemar's test is a statistical test used to determine whether the proportions of categories in two related (paired) groups differ significantly from each other. To use this test, you need two related group variables measured on the same subjects, each with two categories (giving a 2x2 contingency table of paired outcomes).
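When comparing two classifiers on the same test set, McNemar's test uses only the instances on which the two classifiers disagree. The sketch below uses made-up disagreement counts; the continuity-corrected chi-square form and scipy are assumptions of this illustration:

```python
# McNemar's test sketch for comparing two classifiers on the same test instances.
# b = instances classifier A got right and B got wrong; c = the reverse (made-up counts).
from scipy.stats import chi2

b, c = 18, 7

# chi-square statistic with continuity correction, 1 degree of freedom
statistic = (abs(b - c) - 1) ** 2 / (b + c)
p_value = chi2.sf(statistic, df=1)

print(statistic, p_value)   # small p-value -> the two classifiers' error rates differ
```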