Machine Learning - Overview
Generative algorithms
- Covariance matrix = XXᵀ (X zero-mean)
- k-NN
- Start at a data point
- Grow a sphere (no fixed size)
- Stop once at least k points are inside the sphere
- Majority vote among those k points (see the sketch below)
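A minimal k-NN sketch of the rule above (sorting by distance instead of literally growing a sphere; the toy data is made up):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # "Grow the sphere": equivalently, sort all distances and keep the k nearest.
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    # Majority vote among the k neighbours.
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X, y, np.array([4.5, 5.0])))  # -> 1
```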
- Naive Bayes
- For each class
- For each feature → compute p(x|y) (just count the occurrences; see the sketch below)
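A counting-based Naive Bayes sketch for discrete features (the 1e-6 floor for unseen values is my own choice):

```python
import numpy as np
from collections import defaultdict

def fit_naive_bayes(X, y):
    # Estimate p(y) and p(x_j = v | y) per class and feature by counting occurrences.
    priors, likelihoods = {}, defaultdict(dict)
    for c in np.unique(y):
        Xc = X[y == c]
        priors[c] = len(Xc) / len(X)
        for j in range(X.shape[1]):
            values, counts = np.unique(Xc[:, j], return_counts=True)
            likelihoods[c][j] = dict(zip(values, counts / len(Xc)))
    return priors, likelihoods

def nb_predict(priors, likelihoods, x):
    # Class maximising p(y) * prod_j p(x_j | y); unseen values get a small floor.
    scores = {c: priors[c] * np.prod([likelihoods[c][j].get(x[j], 1e-6)
                                      for j in range(len(x))])
              for c in priors}
    return max(scores, key=scores.get)

X = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
y = np.array([1, 1, 0, 0])
print(nb_predict(*fit_naive_bayes(X, y), X[0]))  # -> 1
```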
Discriminative algorithms
Linear classifiers
Logistic regression
- Cost function → come up with a good weight vector
- Gradient descent
- Algorithm (see the gradient-descent sketch after this list):
- For each class:
- Start with random values for theta
- Compute Cost() → if < tolerance: break
- Cost uses the hypothesis (which uses the sigmoid function) on each element
- The sigmoid function outputs the probability p(y|x) - checked against the labels through the cost function
- Compute new values of theta with gradient descent
- → the linear discriminant is a linear equation (linear decision boundary)
- Decision boundary: p(y1|x) = p(y2|x)
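A gradient-descent sketch of the loop above for the two-class case (learning rate, tolerance and iteration cap are my own choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # Cross-entropy cost; h = sigmoid(X @ theta) is the hypothesis p(y=1 | x).
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h + 1e-12) + (1 - y) * np.log(1 - h + 1e-12))

def fit_logistic(X, y, lr=0.1, tol=1e-3, max_iter=10000):
    theta = np.random.default_rng(0).normal(size=X.shape[1])  # random start
    for _ in range(max_iter):
        if cost(theta, X, y) < tol:                      # acceptable → stop
            break
        grad = X.T @ (sigmoid(X @ theta) - y) / len(y)   # gradient of the cost
        theta -= lr * grad                               # descent step
    return theta
```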
Multiclass classifiers
- One versus one
- One versus the rest
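A one-versus-the-rest sketch; `fit_binary` and `score_fn` are hypothetical placeholders for any two-class trainer and its confidence score:

```python
import numpy as np

def one_vs_rest_fit(X, y, fit_binary):
    # One binary classifier per class: class c against everything else.
    return {c: fit_binary(X, (y == c).astype(float)) for c in np.unique(y)}

def one_vs_rest_predict(models, score_fn, x):
    # Predict the class whose binary classifier is most confident.
    return max(models, key=lambda c: score_fn(models[c], x))
```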
SVM
- Points not linearly separable → penalty for missing a point
- Hinge loss
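A sketch of the soft-margin objective above: the hinge loss max(0, 1 - y(wᵀx + b)) charges a penalty for every point inside the margin or misclassified (C weights that penalty):

```python
import numpy as np

def svm_objective(w, b, X, y, C=1.0):
    # y in {-1, +1}; points inside the margin or misclassified have margin < 1.
    margins = y * (X @ w + b)
    hinge = np.maximum(0, 1 - margins)           # hinge loss per point
    return 0.5 * np.dot(w, w) + C * hinge.sum()  # margin term + penalties
```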
Non-linear classifiers
Decision trees
- ID3
- Decide which attribute to split on
- For each value → new child node
- Split training examples
- If pure / acceptable → stop
- Best attribute?
- Purity
- On each attribute
- Calculate entropy
- Information gain (see the sketch after this list)
- Penalize attributes with many children (gain ratio)
- Avoid overfitting: pruning
- Measure performance when removing a node and its children from the tree
- Remove the node resulting in the greatest improvement
- Repeat until further pruning is harmful
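A sketch of the purity computation referenced above: entropy of the labels, and the information gain of splitting on one attribute:

```python
import numpy as np

def entropy(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(x, y):
    # Parent entropy minus the weighted entropy of the children
    # created by splitting on attribute x (one child per value).
    gain = entropy(y)
    for v in np.unique(x):
        mask = x == v
        gain -= mask.mean() * entropy(y[mask])
    return gain

x = np.array([0, 0, 1, 1])
y = np.array(["a", "a", "b", "b"])
print(information_gain(x, y))  # -> 1.0 (a perfectly pure split)
```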
MLP
- Single class
- Activation function
- Linear
- Multiple class
- Non-linear
Multi-layer perceptron
- 1. Calculate the position of the data point relative to each of the decision boundaries
- 2. Combine the results of step 1 to determine the position of the data point relative to both decision boundaries, and hence its class (see the XOR sketch below)
- Passes
- Feed-forward
- Backpropagation
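A minimal two-layer sketch on XOR, where the hidden units play the role of the decision boundaries in steps 1-2 (layer sizes, learning rate and seed are my own choices; another seed may need more epochs):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(X, y, hidden=4, lr=0.5, epochs=5000, seed=0):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    y = y.reshape(-1, 1)
    for _ in range(epochs):
        h = sigmoid(X @ W1)                      # feed-forward: hidden layer
        out = sigmoid(h @ W2)                    # feed-forward: output
        d_out = (out - y) * out * (1 - out)      # backpropagation: output delta
        d_h = (d_out @ W2.T) * h * (1 - h)       # backpropagation: hidden delta
        W2 -= lr * h.T @ d_out
        W1 -= lr * X.T @ d_h
    return W1, W2

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)          # XOR: not linearly separable
W1, W2 = train_mlp(X, y)
print(np.round(sigmoid(sigmoid(X @ W1) @ W2).ravel(), 2))
```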
Combining classifiers
- Steps
- Take input pattern
- Run through all classifiers
- Put all these in a combiner
- Kullback-Leibler
- Majority voting
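A sketch of the majority-voting combiner (each entry is one classifier's label for the same input pattern):

```python
import numpy as np

def majority_vote(predictions):
    # Most frequent label across the classifiers wins.
    labels, counts = np.unique(predictions, return_counts=True)
    return labels[np.argmax(counts)]

print(majority_vote(np.array([1, 0, 1])))  # -> 1
```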
PCA
- Find eigenpairs of the covariance XXᵀ (X being zero-mean)
- Pivotal condensation
- Power iteration
- Sensitive to large values → scaling
- Scree plot
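A PCA sketch via eigendecomposition of the covariance matrix (rows are samples here, so the covariance is XᵀX/n; scale the features first if their ranges differ):

```python
import numpy as np

def pca(X, n_components=2):
    Xc = X - X.mean(axis=0)                  # zero-mean the data
    cov = Xc.T @ Xc / len(Xc)                # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenpairs (eigh: symmetric matrix)
    order = np.argsort(eigvals)[::-1]        # sorted eigenvalues → scree plot
    return Xc @ eigvecs[:, order[:n_components]], eigvals[order]
```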
Clustering
- Proximity measure
- Partition
- K-means
- Hierarchical techniques
- Single linkage
- Average linkage
- Complete linkage
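A k-means sketch with Euclidean distance as the proximity measure (random-sample initialisation; no handling of empty clusters):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign every point to its nearest centre.
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        # Move each centre to the mean of its assigned points.
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers
```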
Evaluation
Error estimation
- Test/training set division
- Bootstrapping
- Error = avg(errors)
- k-fold cross validation
- Error = avg(errors)
- LOO cross-validation
- Double (nested) cross-validation
- Cross-validation inception: an inner CV loop inside each outer training fold
- Learning curves
- Feature curves
- Bias-variance dilemma
- Confusion matrices
- Rejection curves
- Outlier
- Ambiguity
- ROC curve
- Plot the error of one class against the error of the other while varying the decision threshold
- AUC
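A k-fold cross-validation sketch of Error = avg(errors); `train_fn` and `error_fn` are hypothetical placeholders for any trainer and error measure:

```python
import numpy as np

def k_fold_error(X, y, train_fn, error_fn, k=5, seed=0):
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]                                    # held-out fold
        train = np.concatenate(folds[:i] + folds[i + 1:])  # remaining k-1 folds
        model = train_fn(X[train], y[train])
        errors.append(error_fn(model, X[test], y[test]))
    return np.mean(errors)                                 # Error = avg(errors)
```

With k = len(X) this reduces to leave-one-out (LOO) cross-validation.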
Learning goals
- Evaluate a trained machine learning model
- Explain why a training and a test dataset are needed
- Explain what cross-validation and bootstrapping are
- Explain and compute learning curves
- Avoid overfitting by performing regularisation
- Explain reject curves and ROC curves