l14 Machine Learning
Neural Networks
13-12-2022
– More data
– Faster computers (GPUs)
– Some improvements:
● ReLU ‘activation functions’
● Drop-out
● Batch normalization
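As an illustration only (not from the slides), these three ingredients appear as ready-made layers in modern frameworks; a minimal PyTorch sketch with arbitrary layer sizes:

import torch
from torch import nn

# A small feed-forward block combining the three improvements listed above:
# ReLU activations, drop-out, and batch normalization. All sizes are arbitrary.
model = nn.Sequential(
    nn.Linear(20, 64),    # weighted sum of 20 input features
    nn.BatchNorm1d(64),   # batch normalization over the 64 hidden units
    nn.ReLU(),            # ReLU: cuts off values below zero
    nn.Dropout(p=0.5),    # drop-out: randomly zeroes units during training
    nn.Linear(64, 1),     # output layer
)

x = torch.randn(8, 20)    # a mini-batch of 8 examples
print(model(x).shape)     # torch.Size([8, 1])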
Supervised Neural Networks
● Non-linear models for classification and regression
● Work well for very large datasets
● Non-convex optimization
● But computing a series of weighted sums is mathematically the same as computing a single weighted sum, so isn’t this still just a linear model?
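A quick numerical check of this point (my sketch, not part of the slides): two weighted sums in a row collapse into one.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)          # input features
W1 = rng.normal(size=(3, 4))    # weights: input -> hidden
W2 = rng.normal(size=(1, 3))    # weights: hidden -> output

h = W1 @ x                      # first weighted sum
y_stacked = W2 @ h              # second weighted sum on top of it

y_single = (W2 @ W1) @ x        # one weighted sum with the combined matrix
print(np.allclose(y_stacked, y_single))   # True: still a linear model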
Applying a Nonlinear Function
● After computing a weighted sum for each hidden unit, a nonlinear function is applied to the result
● Common choices:
– Rectifying nonlinearity (aka rectified linear unit or relu): cuts off values below zero
– Tangens hyperbolicus (tanh): saturates to –1 for low input values and +1 for high input values
● The result of this function is then used in the weighted sum that computes the output, ŷ
Nonlinear activation function
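A small NumPy sketch (added here; the input values are assumed) of the two activation functions just described:

import numpy as np

line = np.linspace(-3, 3, 7)          # a few input values from -3 to 3

relu = np.maximum(line, 0)            # relu: cuts off values below zero
tanh = np.tanh(line)                  # tanh: saturates to -1 and +1

print(relu)                # [0. 0. 0. 0. 1. 2. 3.]
print(np.round(tanh, 2))   # [-1.   -0.96 -0.76  0.    0.76  0.96  1.  ]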
Formula for MLP with tanh
Nonlinearity
● x - input features
● ŷ - computed output
● h - intermediate computations
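Reconstructed in this notation (the weight names w, v and the biases b are my assumptions; only x, h and ŷ appear above), the standard one-hidden-layer MLP with tanh is:

h_j = \tanh\left( \sum_i w_{ij}\, x_i + b_j \right)
\hat{y} = \sum_j v_j\, h_j + b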
Can have arbitrarily many layers
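For example (the dataset and layer sizes here are my assumptions), layers are stacked in scikit-learn’s MLPClassifier via hidden_layer_sizes:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=200, noise=0.25, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers of 10 units each with tanh activations; hidden_layer_sizes
# can list as many layers as desired.
mlp = MLPClassifier(hidden_layer_sizes=(10, 10), activation='tanh',
                    max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)
print(mlp.score(X_test, y_test))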
Estimating complexity
● Count the weights:
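As a worked example with hypothetical sizes (100 input features, one hidden layer of 100 units, a single output; none of these numbers are from the slide):

n_features, n_hidden, n_outputs = 100, 100, 1

input_to_hidden = n_features * n_hidden     # 10,000 weights
hidden_to_output = n_hidden * n_outputs     # 100 weights
biases = n_hidden + n_outputs               # 101 bias terms

print(input_to_hidden + hidden_to_output + biases)   # 10,201 parameters in total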